Sorting Algorithms

Sort by Corpus Position

Path: flexiconc/algorithms/sort_by_corpus_position.py

Description:

Sorts the concordance lines by their line_id, which corresponds to their position in the corpus.

Arguments:

No arguments defined.

Full JSON schema:

{
  "type": "object",
  "properties": {},
  "required": []
}
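
Since the line_id reflects corpus position, this ordering amounts to an ascending sort on line_id. The following is a minimal sketch of that idea, not FlexiConc's implementation; the function name and the sample ids are made up for illustration.

```python
def sort_by_corpus_position(line_ids):
    """Return the given line_ids in ascending order, i.e. in corpus order."""
    return sorted(line_ids)


# Hypothetical line_ids, currently shown in some other arrangement.
current_order = [17, 3, 42, 8]
corpus_order = sort_by_corpus_position(current_order)
print(corpus_order)  # [3, 8, 17, 42]
```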

Sort by Token-Level Attribute

Path: flexiconc/algorithms/sort_by_token_attribute.py

Description:

Sorts the concordance lines by a given token-level attribute, using locale-specific collation (default locale 'en'). The sort key is either a single token at a given offset or the whole left or right context, formed by joining its tokens. For the left context, tokens are joined from right to left, so the rightmost token of the left context comes first. Optionally, each key string is reversed before comparison (e.g., for right-to-left sorting).

Arguments:

Name	Type	Description
tokens_attribute	string	The token attribute to sort by.
sorting_scope	string	Specifies which context to use for sorting: 'token' for a single token at the given offset, 'left' for the entire left context (joined from right to left), or 'right' for the entire right context.
offset	integer	The offset value to filter tokens by when sorting_scope is 'token'.
case_sensitive	boolean	If True, performs a case-sensitive sort.
reverse	boolean	If True, sort in descending order.
backwards	boolean	If True, reverses the string (e.g., for right-to-left sorting).
locale_str	string	ICU locale string for language-specific sorting.
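
As an illustration, an argument set for this algorithm could be written as a plain Python dictionary. The check_args helper below is hypothetical and only mirrors the required fields and the sorting_scope values listed in the schema for this algorithm; how the arguments are actually passed to FlexiConc is not shown here.

```python
# Values taken from the schema for this algorithm.
REQUIRED = ("tokens_attribute", "sorting_scope")
ALLOWED_SCOPES = ("token", "left", "right")


def check_args(args):
    """Hypothetical helper: raise ValueError if the arguments are unusable."""
    missing = [name for name in REQUIRED if name not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    if args["sorting_scope"] not in ALLOWED_SCOPES:
        raise ValueError(f"unknown sorting_scope: {args['sorting_scope']!r}")
    return True


# Sort by the word form of the token at offset 1, in descending order.
args = {
    "tokens_attribute": "word",
    "sorting_scope": "token",
    "offset": 1,
    "case_sensitive": False,
    "reverse": True,
    "locale_str": "en",
}
check_args(args)
```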

Full JSON schema:

{
  "type": "object",
  "properties": {
    "tokens_attribute": {
      "type": "string",
      "description": "The token attribute to sort by.",
      "default": "word",
      "x-eval": "dict(enum=list(set(conc.tokens.columns) - {'id_in_line', 'line_id', 'offset'}))"
    },
    "sorting_scope": {
      "type": "string",
      "description": "Specifies which context to use for sorting: 'token' for a single token at the given offset, 'left' for the entire left context (joined from right to left), or 'right' for the entire right context.",
      "default": "token",
      "enum": [
        "token",
        "left",
        "right"
      ]
    },
    "offset": {
      "type": "integer",
      "description": "The offset value to filter tokens by when sorting_scope is 'token'.",
      "default": 0,
      "x-eval": "dict(minimum=min(conc.tokens['offset']), maximum=max(conc.tokens['offset']))"
    },
    "case_sensitive": {
      "type": "boolean",
      "description": "If True, performs a case-sensitive sort.",
      "default": false
    },
    "reverse": {
      "type": "boolean",
      "description": "If True, sort in descending order.",
      "default": false
    },
    "backwards": {
      "type": "boolean",
      "description": "If True, reverses the string (e.g., for right-to-left sorting).",
      "default": false
    },
    "locale_str": {
      "type": "string",
      "description": "ICU locale string for language-specific sorting.",
      "default": "en"
    }
  },
  "required": [
    "tokens_attribute",
    "sorting_scope"
  ]
}
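
The way the three sorting scopes, case_sensitive, and backwards shape the sort key can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: it assumes negative offsets denote the left context and positive offsets the right context, it represents each line as a list of (offset, value) pairs, and it uses Python's built-in string comparison in place of ICU locale collation. The names sort_key and lines are hypothetical.

```python
def sort_key(tokens, sorting_scope="token", offset=0,
             case_sensitive=False, backwards=False):
    """Build a plain-string sort key for one concordance line.

    tokens: list of (offset, value) pairs for the chosen attribute.
    Assumption: negative offsets are left context, positive are right context.
    """
    if sorting_scope == "token":
        parts = [value for off, value in tokens if off == offset]
    elif sorting_scope == "left":
        # Join from right to left: the rightmost left-context token comes first.
        left = sorted((t for t in tokens if t[0] < 0),
                      key=lambda t: t[0], reverse=True)
        parts = [value for _, value in left]
    elif sorting_scope == "right":
        right = sorted((t for t in tokens if t[0] > 0), key=lambda t: t[0])
        parts = [value for _, value in right]
    else:
        raise ValueError(f"unknown sorting_scope: {sorting_scope!r}")

    key = " ".join(parts)
    if not case_sensitive:
        key = key.casefold()
    if backwards:
        key = key[::-1]  # compare strings from their last character
    return key


# Hypothetical lines, keyed by line_id.
lines = {
    1: [(-2, "the"), (-1, "Old"), (0, "house"), (1, "on"), (2, "fire")],
    2: [(-2, "a"), (-1, "big"), (0, "house"), (1, "In"), (2, "town")],
    3: [(-2, "my"), (-1, "new"), (0, "House"), (1, "by"), (2, "sea")],
}

by_left = sorted(lines, key=lambda i: sort_key(lines[i], "left"))
by_next_token = sorted(lines, key=lambda i: sort_key(lines[i], "token", offset=1))
by_endings = sorted(
    lines, key=lambda i: sort_key(lines[i], "token", offset=-1, backwards=True)
)

print(by_left)        # [2, 3, 1]  keys: "big a", "new my", "old the"
print(by_next_token)  # [3, 2, 1]  keys: "by", "in", "on"
print(by_endings)     # [1, 2, 3]  keys: "dlo", "gib", "wen"
```

Passing reverse=True to sorted() in this sketch would correspond to the reverse argument (descending order).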

Random Sort

Path: flexiconc/algorithms/sort_random.py

Description:

Sorts lines in a pseudo-random but stable manner. For a given seed, any two line_ids always appear in the same relative order, regardless of which other lines are present.

Arguments:

Name	Type	Description
seed	integer	An optional seed for generating the pseudo-random order.

Full JSON schema:

{
  "type": "object",
  "properties": {
    "seed": {
      "type": [
        "integer"
      ],
      "description": "An optional seed for generating the pseudo-random order.",
      "default": 42
    }
  },
  "required": [
    "seed"
  ]
}
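
One way to obtain the stability property described above is to derive each line's sort key from the seed and its own line_id only, so the key never depends on the other lines. The sketch below uses a SHA-256 hash for this; that choice, and the names random_key and sort_random, are assumptions for illustration and may differ from what FlexiConc actually does.

```python
import hashlib


def random_key(line_id, seed=42):
    """Derive a pseudo-random integer from the seed and a single line_id."""
    digest = hashlib.sha256(f"{seed}:{line_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def sort_random(line_ids, seed=42):
    """Order line_ids by their per-line pseudo-random key."""
    return sorted(line_ids, key=lambda line_id: random_key(line_id, seed))


full = sort_random(range(10))
subset = sort_random([2, 5, 7])

# The three lines keep their relative order whether or not other lines are present.
same_relative_order = [i for i in full if i in {2, 5, 7}] == subset
print(same_relative_order)  # True
```

Because each key is a function of (seed, line_id) alone, removing or adding lines cannot change the relative order of the lines that remain.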