RDF Model

Oxigraph provides Python classes to represent basic RDF concepts:

IRIs

class pyoxigraph.NamedNode(value)

An RDF node identified by an IRI.

Parameters:

value (str) – the IRI as a string.

Raises:

ValueError – if the IRI is not valid according to RFC 3987.

The str function provides a serialization compatible with N-Triples, Turtle, and SPARQL:

>>> str(NamedNode('http://example.com'))
'<http://example.com>'
value
Returns:

the named node IRI.

Return type:

str

>>> NamedNode("http://example.com").value
'http://example.com'

Blank Nodes

class pyoxigraph.BlankNode(value=None)

An RDF blank node.

Parameters:

value (str or None, optional) – the blank node identifier (if not present, a random blank node identifier is automatically generated).

Raises:

ValueError – if the blank node identifier is invalid according to the N-Triples, Turtle, and SPARQL grammars.

The str function provides a serialization compatible with N-Triples, Turtle, and SPARQL:

>>> str(BlankNode('ex'))
'_:ex'
value
Returns:

the blank node identifier.

Return type:

str

>>> BlankNode("ex").value
'ex'

Literals

class pyoxigraph.Literal(value, *, datatype=None, language=None)

An RDF literal.

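Parameters:

value (str) – the literal value or lexical form.
datatype (NamedNode or None, optional) – the literal datatype IRI.
language (str or None, optional) – the literal language tag.

Raises: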

ValueError – if the language tag is not valid according to RFC 5646 (BCP 47).

The str function provides a serialization compatible with N-Triples, Turtle, and SPARQL:

>>> str(Literal('example'))
'"example"'
>>> str(Literal('example', language='en'))
'"example"@en'
>>> str(Literal('11', datatype=NamedNode('http://www.w3.org/2001/XMLSchema#integer')))
'"11"^^<http://www.w3.org/2001/XMLSchema#integer>'
datatype
Returns:

the literal datatype IRI.

Return type:

NamedNode

>>> Literal('11', datatype=NamedNode('http://www.w3.org/2001/XMLSchema#integer')).datatype
<NamedNode value=http://www.w3.org/2001/XMLSchema#integer>
>>> Literal('example').datatype
<NamedNode value=http://www.w3.org/2001/XMLSchema#string>
>>> Literal('example', language='en').datatype
<NamedNode value=http://www.w3.org/1999/02/22-rdf-syntax-ns#langString>
language
Returns:

the literal language tag.

Return type:

str or None

>>> Literal('example', language='en').language
'en'
>>> Literal('example').language
value
Returns:

the literal value or lexical form.

Return type:

str

>>> Literal("example").value
'example'

Triples

class pyoxigraph.Triple(subject, predicate, object)

An RDF triple.

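Parameters:

subject (NamedNode or BlankNode or Triple) – the triple subject.
predicate (NamedNode) – the triple predicate.
object (NamedNode or BlankNode or Literal or Triple) – the triple object.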

The str function provides a serialization compatible with N-Triples, Turtle, and SPARQL:

>>> str(Triple(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1')))
'<http://example.com> <http://example.com/p> "1"'

A triple can also be easily destructured into its components:

>>> (s, p, o) = Triple(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'))
object
Returns:

the triple object.

Return type:

NamedNode or BlankNode or Literal or Triple

>>> Triple(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1')).object
<Literal value=1 datatype=<NamedNode value=http://www.w3.org/2001/XMLSchema#string>>
predicate
Returns:

the triple predicate.

Return type:

NamedNode

>>> Triple(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1')).predicate
<NamedNode value=http://example.com/p>
subject
Returns:

the triple subject.

Return type:

NamedNode or BlankNode or Triple

>>> Triple(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1')).subject
<NamedNode value=http://example.com>

Quads (triples in an RDF dataset)

class pyoxigraph.Quad(subject, predicate, object, graph_name=None)

An RDF triple in an RDF dataset.

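Parameters:

subject (NamedNode or BlankNode or Triple) – the quad subject.
predicate (NamedNode) – the quad predicate.
object (NamedNode or BlankNode or Literal or Triple) – the quad object.
graph_name (NamedNode or BlankNode or DefaultGraph or None, optional) – the quad graph name (if not present, the default graph is assumed).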

The str function provides a serialization compatible with N-Triples, Turtle, and SPARQL:

>>> str(Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g')))
'<http://example.com> <http://example.com/p> "1" <http://example.com/g>'
>>> str(Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), DefaultGraph()))
'<http://example.com> <http://example.com/p> "1"'

A quad can also be easily destructured into its components:

>>> (s, p, o, g) = Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g'))
graph_name
Returns:

the quad graph name.

Return type:

NamedNode or BlankNode or DefaultGraph

>>> Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g')).graph_name
<NamedNode value=http://example.com/g>
object
Returns:

the quad object.

Return type:

NamedNode or BlankNode or Literal or Triple

>>> Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g')).object
<Literal value=1 datatype=<NamedNode value=http://www.w3.org/2001/XMLSchema#string>>
predicate
Returns:

the quad predicate.

Return type:

NamedNode

>>> Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g')).predicate
<NamedNode value=http://example.com/p>
subject
Returns:

the quad subject.

Return type:

NamedNode or BlankNode or Triple

>>> Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g')).subject
<NamedNode value=http://example.com>
triple
Returns:

the quad underlying triple.

Return type:

Triple

>>> Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g')).triple
<Triple subject=<NamedNode value=http://example.com> predicate=<NamedNode value=http://example.com/p> object=<Literal value=1 datatype=<NamedNode value=http://www.w3.org/2001/XMLSchema#string>>>
class pyoxigraph.DefaultGraph

The RDF default graph name.

Datasets

class pyoxigraph.Dataset(quads=None)

An in-memory RDF dataset.

It can accommodate a fairly large number of quads (on the order of a few million).

Use Store if you need on-disk persistence or SPARQL.

Warning: The dataset interns strings and does not garbage-collect them yet: if you insert and remove many different terms, memory usage will keep growing without being reclaimed.

Parameters:

quads (collections.abc.Iterable[Quad] or None, optional) – some quads to initialize the dataset with.

The str function provides an N-Quads serialization:

>>> str(Dataset([Quad(NamedNode('http://example.com/s'), NamedNode('http://example.com/p'), NamedNode('http://example.com/o'), NamedNode('http://example.com/g'))]))
'<http://example.com/s> <http://example.com/p> <http://example.com/o> <http://example.com/g> .\n'
add(quad)

Adds a quad to the dataset.

Parameters:

quad (Quad) – the quad to add.

Return type:

None

>>> quad = Quad(NamedNode('http://example.com/s'), NamedNode('http://example.com/p'), NamedNode('http://example.com/o'), NamedNode('http://example.com/g'))
>>> dataset = Dataset()
>>> dataset.add(quad)
>>> quad in dataset
True
canonicalize(algorithm)

Canonicalizes the dataset by renaming blank nodes.

Warning: Blank node ids depend on the current shape of the graph. Adding a new quad might change the ids of many blank nodes. Hence, this canonicalization might not be suitable for diffs.

Warning: This implementation's worst-case complexity is O(b!), where b is the number of blank nodes in the input dataset.

Parameters:

algorithm (CanonicalizationAlgorithm) – the canonicalization algorithm to use.

Return type:

None

>>> d1 = Dataset([Quad(BlankNode(), NamedNode('http://example.com/p'), BlankNode())])
>>> d2 = Dataset([Quad(BlankNode(), NamedNode('http://example.com/p'), BlankNode())])
>>> d1 == d2
False
>>> d1.canonicalize(CanonicalizationAlgorithm.UNSTABLE)
>>> d2.canonicalize(CanonicalizationAlgorithm.UNSTABLE)
>>> d1 == d2
True
clear()

Removes all quads from the dataset.

Return type:

None

>>> quad = Quad(NamedNode('http://example.com/s'), NamedNode('http://example.com/p'), NamedNode('http://example.com/o'), NamedNode('http://example.com/g'))
>>> dataset = Dataset([quad])
>>> dataset.clear()
>>> len(dataset)
0
discard(quad)

Removes a quad from the dataset if it is present.

Parameters:

quad (Quad) – the quad to remove.

Return type:

None

>>> quad = Quad(NamedNode('http://example.com/s'), NamedNode('http://example.com/p'), NamedNode('http://example.com/o'), NamedNode('http://example.com/g'))
>>> dataset = Dataset([quad])
>>> dataset.discard(quad)
>>> quad in dataset
False
quads_for_graph_name(graph_name)

Looks for the quads with the given graph name.

Parameters:

graph_name (NamedNode or BlankNode or DefaultGraph) – the quad graph name.

Returns:

an iterator of the quads.

Return type:

collections.abc.Iterator[Quad]

>>> store = Dataset([Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g'))])
>>> list(store.quads_for_graph_name(NamedNode('http://example.com/g')))
[<Quad subject=<NamedNode value=http://example.com> predicate=<NamedNode value=http://example.com/p> object=<Literal value=1 datatype=<NamedNode value=http://www.w3.org/2001/XMLSchema#string>> graph_name=<NamedNode value=http://example.com/g>>]
quads_for_object(object)

Looks for the quads with the given object.

Parameters:

object (NamedNode or BlankNode or Literal or Triple) – the quad object.

Returns:

an iterator of the quads.

Return type:

collections.abc.Iterator[Quad]

>>> store = Dataset([Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g'))])
>>> list(store.quads_for_object(Literal('1')))
[<Quad subject=<NamedNode value=http://example.com> predicate=<NamedNode value=http://example.com/p> object=<Literal value=1 datatype=<NamedNode value=http://www.w3.org/2001/XMLSchema#string>> graph_name=<NamedNode value=http://example.com/g>>]
quads_for_predicate(predicate)

Looks for the quads with the given predicate.

Parameters:

predicate (NamedNode) – the quad predicate.

Returns:

an iterator of the quads.

Return type:

collections.abc.Iterator[Quad]

>>> store = Dataset([Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g'))])
>>> list(store.quads_for_predicate(NamedNode('http://example.com/p')))
[<Quad subject=<NamedNode value=http://example.com> predicate=<NamedNode value=http://example.com/p> object=<Literal value=1 datatype=<NamedNode value=http://www.w3.org/2001/XMLSchema#string>> graph_name=<NamedNode value=http://example.com/g>>]
quads_for_subject(subject)

Looks for the quads with the given subject.

Parameters:

subject (NamedNode or BlankNode or Triple) – the quad subject.

Returns:

an iterator of the quads.

Return type:

collections.abc.Iterator[Quad]

>>> store = Dataset([Quad(NamedNode('http://example.com'), NamedNode('http://example.com/p'), Literal('1'), NamedNode('http://example.com/g'))])
>>> list(store.quads_for_subject(NamedNode('http://example.com')))
[<Quad subject=<NamedNode value=http://example.com> predicate=<NamedNode value=http://example.com/p> object=<Literal value=1 datatype=<NamedNode value=http://www.w3.org/2001/XMLSchema#string>> graph_name=<NamedNode value=http://example.com/g>>]
remove(quad)

Removes a quad from the dataset and raises an exception if it is not in the set.

Parameters:

quad (Quad) – the quad to remove.

Return type:

None

Raises:

KeyError – if the element was not in the set.

>>> quad = Quad(NamedNode('http://example.com/s'), NamedNode('http://example.com/p'), NamedNode('http://example.com/o'), NamedNode('http://example.com/g'))
>>> dataset = Dataset([quad])
>>> dataset.remove(quad)
>>> quad in dataset
False
class pyoxigraph.CanonicalizationAlgorithm

RDF canonicalization algorithms.

The following algorithms are supported:

  • CanonicalizationAlgorithm.UNSTABLE: an unstable algorithm preferred by PyOxigraph.