amazon-neptune gremlin gremlinpython tinkerpop

Gremlin identifying subsets of populations based on vertices that only have in or out edges

Published on 2020-04-23 15:06:53

I've got a Gremlin graph of users who have features. The edges go out from the users and into the features: users have no incoming edges, and features have no outgoing edges. Each user vertex has dozens of outgoing edges to feature vertices.

I want to find the subset of female users who are connected to both the feature_a and feature_b vertices. I am using gremlinpython, and I know I can do a set intersection in Python with the code below. Is there a way to achieve this in Gremlin itself?

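# `client` is assumed to be an already-connected gremlin_python.driver.client.Client
q = '''g.V('feature_a').
        in().has('gender','Female')'''

res1 = client.submit(q).all().result()

q2 = '''g.V('feature_b').
        in().has('gender','Female')'''

res2 = client.submit(q2).all().result()

# Users that appear in both result sets
vs_in_common = set(res2).intersection(set(res1))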
Questioner: Justin Gerard
Daniel Kuppitz 2020-02-11 22:28

What Michael posted works, but it's a full scan across all users in the graph (unless gender is indexed, but this would cause other issues sooner or later). Instead, start at feature_a, walk to its female users, and keep only those that also have an edge to feature_b, something like the following:

g.V().hasId('feature_a').
  in().has('gender', 'Female').
  filter(out().hasId('feature_b'))

Also, if possible, provide an edge label in the in() and out() steps, so that only edges with that label are traversed.
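
Since the question submits script strings through gremlinpython, the traversal above could be assembled in Python along these lines. This is only a sketch: the helper name build_query and the edge label 'has_feature' are made up for illustration and are not taken from the question's graph, and the quoting helper is a minimal escape for single-quoted string literals, not a full sanitizer.

```python
def build_query(feature_a, feature_b, gender="Female", edge_label=None):
    """Build the Gremlin script that starts at feature_a and keeps only
    users of the given gender that also point at feature_b.

    edge_label is optional; 'has_feature' in the usage below is a
    hypothetical label, substitute whatever your graph actually uses.
    """
    def q(value):
        # Quote a value as a single-quoted Gremlin (Groovy) string literal.
        return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"

    # With no label, in()/out() follow every edge, as in the answer above.
    label = q(edge_label) if edge_label else ""
    return (
        f"g.V().hasId({q(feature_a)})."
        f"in({label}).has('gender', {q(gender)})."
        f"filter(out({label}).hasId({q(feature_b)}))"
    )


# Usage, assuming `client` is a connected gremlin_python.driver.client.Client:
#   query = build_query("feature_a", "feature_b", edge_label="has_feature")
#   users = client.submit(query).all().result()
```

With a single query there is no need to fetch two result sets and intersect them client-side; the filter runs on the server and only the matching users come back.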