Large RDF import fails despite big memory allocation

I tried to import an RDF file containing tens of thousands of triples and nodes using the following command:

CALL n10s.onto.import.fetch("", "RDF/XML",{commitSize: 100000, nodeCacheSize: 20000});

I cannot get the ontology to load; instead I get variations of the "partial commit" error, where the reported triplesLoaded count changes depending on the call params:

I ran the neo4j-admin memrec command as recommended in an earlier thread and changed the following config settings on a large AWS machine to make sure the import wasn't resource-bound:


But this still results in the same error.

The log file contains this exception:

$InvalidCacheLoadException: CacheLoader returned null for key

Thanks in advance for any insights or suggestions!

Hi @doug.lee.415
Could you share your GraphConfig so we can try to reproduce it, please?

I mean the n10s.graphConfig… call that you run before the n10s.rdf.import…



I just used the default:
CALL n10s.graphconfig.init();
since I passed commitSize and nodeCacheSize as params in the import call.
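In case it helps with reproducing, the active settings can be listed with the config-inspection procedure (assuming a stock n10s install; it returns one param/value pair per row):

```cypher
// Print the active n10s Graph Config as param/value rows
CALL n10s.graphconfig.show();
```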



So I managed to reproduce it, and there was indeed a bug, which has already been fixed in 4.3 and 4.4. See #245.

However, to save you from having to build from source or wait for the next release, here's a "black magic" :grinning: workaround:
Before running the CALL n10s.onto.import.fetch(..., try the following:

call n10s.onto.import.inline("
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix obo: <http://purl.obolibrary.org/obo/> .

obo:FOODON_03307539 a owl:Class .
obo:CHEBI_16236 a owl:Class .
obo:FOODON_03301625 a owl:Class .
obo:FOODON_00001258 a owl:Class .
obo:FOODON_03305364 a owl:Class .
", "Turtle");

This should import just five classes that are not explicitly declared as such in the foodon.owl file.
Once that's done, you can retry the full import and it should complete successfully.
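As a quick sanity check after the full import, you could count the ontology classes that landed in the graph (assuming the default Graph Config from n10s.graphconfig.init(); adjust the label if you customised classLabel):

```cypher
// Count imported ontology classes; with the default Graph Config,
// owl:Class resources become nodes labelled :Class
MATCH (c:Class) RETURN count(c) AS classes;
```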

Let us know if this solves the problem for you.

Thanks for sharing your experience and making n10s better :pray:


Excellent! This works beautifully! Thanks for your expert assistance.