
How to ignore a property of type List when adding properties to a vertex

Posted on 2020-12-04 19:56:50

I want to add persons as vertices in a graph, which works with the following code:

from gremlin_python.process.graph_traversal import __
from gremlin_python.process.traversal import Column

# g is assumed to be an already-connected GraphTraversalSource
persons = [{"id":1,"name":"bob","age":25}, {"id":2,"name":"joe","age":25,"occupation":"lawyer"}]

g.inject(persons).unfold().as_('entity').\
    addV('entity').as_('v').\
        sideEffect(__.select('entity').unfold().as_('kv').select('v').\
                   property(__.select('kv').by(Column.keys),
                            __.select('kv').by(Column.values)
                            )
                  ).iterate()

Question 1: What if one of the properties is a list or a dict? Example:

persons = [{"id":1,"name":"bob","age":25, "house":{"a":1,"b":4}}, {"id":2,"name":"joe","age":25,"occupation":"lawyer","house":{"a":1,"b":4}}]

How do I skip that one property (house) but still add the rest to the person vertex, and then create a separate vertex from house (with properties a and b) that has an edge to the person?

Question 2: What if I want to modify an attribute before adding it as a property to the graph? For example, convert id to a string and then add it as a property.

Questioner: Ash_inc
Answer by stephen mallette, 2020-12-08 21:51:35

I could be wrong, but I sense that your actual problem is more complex than what you've posted. With that in mind, I'll offer an answer that works under the assumption that each house is unique, which I've made explicit by adding a "hid" (house id) to the data.

gremlin> persons = [["pid":1,"name":"bob","age":25, "house":["hid":10,"a":1,"b":4]], 
......1>            ["pid":2,"name":"joe","age":25,"occupation":"lawyer","house":["hid":20,"a":1,"b":4]]]
==>[pid:1,name:bob,age:25,house:[hid:10,a:1,b:4]]
==>[pid:2,name:joe,age:25,occupation:lawyer,house:[hid:20,a:1,b:4]]
gremlin> g.inject(persons).unfold().as('entity').
......1>   addV('entity').as('v').
......2>   sideEffect(select('entity').unfold().as('kv').select('v').
......3>              choose(select('kv').by(keys).is('house'),
......4>                     addV('house').as('h').
......5>                     addE('owns').from('v').
......6>                     select('kv').by(values).unfold().as('hkv').select('h').
......7>                     property(select('hkv').by(keys),
......8>                              select('hkv').by(values)),
......9>                     property(select('kv').by(keys),
.....10>                              select('kv').by(values))))
==>v[0]
==>v[9]
gremlin> g.V().elementMap()
==>[id:0,label:entity,name:bob,pid:1,age:25]
==>[id:4,label:house,a:1,hid:10,b:4]
==>[id:9,label:entity,occupation:lawyer,name:joe,pid:2,age:25]
==>[id:14,label:house,a:1,hid:20,b:4]
gremlin> g.E().elementMap()
==>[id:5,label:owns,IN:[id:4,label:house],OUT:[id:0,label:entity]]
==>[id:15,label:owns,IN:[id:14,label:house],OUT:[id:9,label:entity]]

I've not really done anything new here, in the sense that I've largely just embedded the traversal pattern you were already using within itself. Note that at line 6 I'm just re-doing what was done on line 2 inside the sideEffect().
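The nesting may be easier to see outside of Gremlin. The following plain-Python sketch is only an in-memory analogue of the traversal above, not gremlin_python code and not something you would run against a graph. The function name insert_persons and the dict-based vertex and edge records are made up for illustration.

```python
def insert_persons(persons):
    """In-memory analogue of the nested insert traversal (illustration only)."""
    vertices, edges = [], []
    for entity in persons:                     # inject(persons).unfold().as('entity')
        v = {"label": "entity"}                # addV('entity').as('v')
        vertices.append(v)
        for key, value in entity.items():      # select('entity').unfold().as('kv')
            if key == "house":                 # choose(select('kv').by(keys).is('house'), ...
                h = {"label": "house"}         # addV('house').as('h')
                vertices.append(h)
                # addE('owns').from('v'): edge points from the person to the house
                edges.append({"label": "owns", "out": v, "in": h})
                # Same unfold-and-set-property pattern, one level down
                for hkey, hvalue in value.items():
                    h[hkey] = hvalue
            else:
                v[key] = value                 # plain property on the person
    return vertices, edges


persons = [
    {"pid": 1, "name": "bob", "age": 25,
     "house": {"hid": 10, "a": 1, "b": 4}},
    {"pid": 2, "name": "joe", "age": 25, "occupation": "lawyer",
     "house": {"hid": 20, "a": 1, "b": 4}},
]
vertices, edges = insert_persons(persons)
```

The inner loop over the house entries mirrors the outer loop over the person entries, which is all the choose() branch adds to the original pattern.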

Now, if my assumption about unique houses is wrong, things get more complicated, because you can't easily inline upsert traversal patterns in this context. Upserts typically use a fold()/coalesce()/unfold() pattern, which conflicts with the "insert only" pattern you are using: you can't backtrack in a traversal (i.e. refer to a previous step) past a reducing barrier such as fold(). In that case I think I would try to restructure the source data to make it more amenable to pure inserts rather than upsert operations.
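As a rough idea of what such restructuring could look like on the client side, here is a minimal plain-Python sketch. It assumes the "pid" and "hid" keys from the sample data above, and the function name restructure and its transforms parameter are hypothetical. It splits the nested records into flat persons, de-duplicated houses, and (pid, hid) pairs for the edges. The same pass is also a natural place to modify values before they reach the graph, such as converting an id to a string (Question 2).

```python
def restructure(persons, transforms=None):
    """Split nested person records into flat persons, unique houses and edge pairs."""
    transforms = transforms or {}
    flat_persons, houses, owns = [], {}, []
    for person in persons:
        flat = {}
        for key, value in person.items():
            if isinstance(value, (dict, list)):
                continue  # nested values are not added as person properties
            # Apply an optional per-key conversion, e.g. {"pid": str}
            flat[key] = transforms.get(key, lambda x: x)(value)
        flat_persons.append(flat)

        house = person.get("house")
        if isinstance(house, dict):
            # De-duplicate by hid here so the graph only ever sees inserts
            houses.setdefault(house["hid"], dict(house))
            owns.append((flat["pid"], house["hid"]))
    return flat_persons, list(houses.values()), owns


# Two persons sharing the same house (hid 10)
persons = [
    {"pid": 1, "name": "bob", "age": 25,
     "house": {"hid": 10, "a": 1, "b": 4}},
    {"pid": 2, "name": "joe", "age": 25, "occupation": "lawyer",
     "house": {"hid": 10, "a": 1, "b": 4}},
]
flat_persons, houses, owns = restructure(persons, transforms={"pid": str})
```

Each of the two vertex lists contains only flat maps, so it could be loaded with the insert-only pattern from the question. Creating the edges from the (pid, hid) pairs would need a separate traversal that looks up both vertices, which is not shown here.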