borda_ranker

Borda grouped-ranking ranker. Ranks each item by summing, across groups, the number of same-group items with strictly lower relevance.

The library implements the ranker_protocol defined in the ranking_protocols library. It provides predicates for learning a ranker from grouped relevance judgments, using it to order candidate items, and exporting it as a list of predicate clauses or to a file.

Datasets are represented as objects implementing the ranking_dataset_protocol protocol from the ranking_protocols library. See the test_datasets directory for examples. The training dataset must declare each group once, use only declared groups and items in relevance judgments, and assign non-negative integer relevance values.

API documentation

Open the ../../apis/library_index.html#borda_ranker link in a web browser.

Loading

To load this library, load the loader.lgt file:

| ?- logtalk_load(borda_ranker(loader)).

Testing

To test this library's predicates, load the tester.lgt file:

| ?- logtalk_load(borda_ranker(tester)).

To run the performance benchmark suite, load the tester_performance.lgt file:

| ?- logtalk_load(borda_ranker(tester_performance)).

Features

  • Grouped Relevance Learning: Learns a deterministic per-item score from grouped ranking or relevance-judgment datasets.

  • Portable Borda Scoring: Computes scores using only non-negative integer grouped relevance judgments and standard Logtalk library predicates. Within each group, an item receives one point for every same-group item with strictly lower relevance when using tie_scoring(standard) and the average of the minimum and maximum tied positions when using tie_scoring(fractional).

  • Deterministic Ranking: Orders candidate items by descending learned score, breaking score ties deterministically using the standard order of terms on the item identifiers.

  • Missing Relevance Semantics: Missing relevance facts are treated as zero by default using the missing_relevance(zero) option and can be rejected using missing_relevance(error).

  • Strict Dataset Validation: Rejects duplicate groups, duplicate items within a group, undeclared groups or items in relevance judgments, and non-integer or negative relevance values.

  • Explicit Semantics Options: The learn/3 predicate accepts explicit tie_scoring/1 and missing_relevance/1 options that control the tie and missing-relevance policies.

  • Benchmark Coverage: Includes a dedicated performance test suite that benchmarks learning and ranking on a large grouped dataset.

  • Training Diagnostics: Learned rankers include dataset summary metadata that can be accessed using the diagnostics/2 predicate.

  • Ranker Export: Learned rankers can be exported as self-contained terms.

  • Shared Ranking Infrastructure: Uses the common ranking_protocols helper predicates for option processing, dataset validation, diagnostics, export, and candidate ranking.

Scoring semantics

This implementation uses a grouped Borda count variant over the declared items of each group. With the default tie_scoring(standard) option, an item receives one point for every same-group item with strictly lower relevance. Tied items therefore receive the same per-group contribution, because equal relevance values do not add or subtract points.

With the tie_scoring(fractional) option, each tied relevance class receives the average of the minimum and maximum per-group Borda points available to that tie block. For example, when two items tie above a single lower-ranked item, both tied items receive 1.5 points instead of the 1 point assigned by the default policy.

Missing relevance facts are treated as relevance 0 only for items that are declared in the group when using the default missing_relevance(zero) option. This allows grouped datasets to omit explicit zero judgments while keeping the score computation deterministic. Use missing_relevance(error) to reject grouped datasets that omit a declared item relevance.
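As a concrete illustration, consider a hypothetical group declaring items a, b, and c with relevance values 2, 2, and 1, respectively. Under tie_scoring(standard), items a and b each score 1 point (one same-group item, c, has strictly lower relevance) and c scores 0. Under tie_scoring(fractional), the tied block {a, b} spans the per-group points 1 and 2, so a and b each score (1 + 2) / 2 = 1.5 points, while c still scores 0.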

Usage

Learning a ranker

% Learn from a grouped ranking dataset object
| ?- borda_ranker::learn(my_dataset, Ranker).
...

% Learn with an explicit empty options list
| ?- borda_ranker::learn(my_dataset, Ranker, []).
...

% Learn while requiring every declared group item to have a relevance fact
| ?- borda_ranker::learn(my_dataset, Ranker, [missing_relevance(error)]).
...

% Learn using fractional tie scoring for tied relevance levels
| ?- borda_ranker::learn(my_dataset, Ranker, [tie_scoring(fractional)]).
...

The current implementation accepts the missing_relevance/1 and tie_scoring/1 options described below.
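The two options can also be combined in the same options list, for example, to use fractional tie scoring while requiring an explicit relevance fact for every declared group item:

% Learn combining both semantics options
| ?- borda_ranker::learn(my_dataset, Ranker, [tie_scoring(fractional), missing_relevance(error)]).
...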

Inspecting diagnostics

% Inspect model and dataset summary metadata
| ?- borda_ranker::learn(my_dataset, Ranker),
     borda_ranker::diagnostics(Ranker, Diagnostics).
Diagnostics = [...]
...

Ranking candidate items

% Rank a candidate set from most preferred to least preferred
| ?- borda_ranker::learn(my_dataset, Ranker),
     borda_ranker::rank(Ranker, [item_a, item_b, item_c], Ranking).
Ranking = [...]
...

Candidate lists must be proper lists of unique, ground items declared by the training dataset. Invalid ranker terms, duplicate candidates, and candidates containing variables are rejected with errors instead of being silently accepted.
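For example, a candidate list with a repeated item is expected to be rejected with an error rather than produce a ranking (the exact error term thrown is determined by the shared validation predicates):

% Duplicate candidates are rejected with an error
| ?- borda_ranker::learn(my_dataset, Ranker),
     borda_ranker::rank(Ranker, [item_a, item_a], Ranking).
! ...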

Exporting the ranker

Learned rankers can be exported as a list of clauses or to a file for later use.

% Export as predicate clauses
| ?- borda_ranker::learn(my_dataset, Ranker),
     borda_ranker::export_to_clauses(my_dataset, Ranker, my_ranker, Clauses).
Clauses = [my_ranker(borda_ranker(...))]
...

% Export to a file
| ?- borda_ranker::learn(my_dataset, Ranker),
     borda_ranker::export_to_file(my_dataset, Ranker, my_ranker, 'ranker.pl').
...

Diagnostics syntax

The diagnostics/2 predicate returns a list of metadata terms with the form:

[
    model(borda_ranker),
    options(Options),
    dataset_summary(DatasetSummary)
]

Where:

  • model(borda_ranker) identifies the learning algorithm that produced the ranker.

  • options(Options) stores the effective learning options after merging the user options with the library defaults.

  • dataset_summary(DatasetSummary) stores a summary list describing the validated training dataset.

The current dataset_summary/1 payload has the form:

[
    groups(NumberOfGroups),
    items(NumberOfItems),
    relevance_judgments(NumberOfJudgments)
]
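Individual metadata terms can also be extracted from the diagnostics list using standard list predicates. The following sketch assumes the Logtalk standard library list object is loaded:

% Extract the dataset summary term from the diagnostics list
| ?- borda_ranker::learn(my_dataset, Ranker),
     borda_ranker::diagnostics(Ranker, Diagnostics),
     list::member(dataset_summary(Summary), Diagnostics).
...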

Use the ranking_protocols diagnostic/2 and ranker_options/2 helper predicates when you only need a single metadata term or the effective options.

Options

The following options can be passed to the learn/3 predicate:

  • missing_relevance(Policy): Controls how declared group items without an explicit relevance fact are handled. The supported values are zero (default) and error.

  • tie_scoring(Policy): Controls the grouped Borda tie semantics. The current implementation supports standard (minimum tied-block score) and fractional (average tied-block score).

Ranker representation

The learned ranker is represented by a compound term of the form:

borda_ranker(Items, Scores, Diagnostics)

Where:

  • Items: List of ranked items.

  • Scores: List of Item-Score pairs.

  • Diagnostics: List of metadata terms, including the effective options and dataset summary.

When exported using export_to_clauses/4 or export_to_file/4, this ranker term is serialized directly as the single argument of the generated predicate clause so that the exported model can be loaded and reused as-is.
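A possible reuse workflow, assuming the exported 'ranker.pl' file can be consulted as plain Prolog clauses defining a my_ranker/1 predicate, is to load the file, retrieve the serialized ranker term, and pass it back to the rank/3 predicate:

% Load a previously exported ranker and reuse it for ranking
| ?- consult('ranker.pl'),
     my_ranker(Ranker),
     borda_ranker::rank(Ranker, [item_a, item_b], Ranking).
...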

References

  1. de Borda, J.-C. (1781). Mémoire sur les élections au scrutin. Histoire de l’Académie Royale des Sciences.