This article is reproduced from serverfault.com.

Adding random relationships in Neo4j only uses single node

Posted on 2019-07-06 22:33:27

My structure looks like this:

Person   -[:HAS_HOBBY]->  Hobby

I'm generating e.g. 500 person nodes and 20 hobby nodes randomly and would now like to generate random links in between them so that each person has 1 or more hobbies but not every person has the same one.

CALL apoc.periodic.iterate("
    match (p:Person),(h:Hobby) with p,h limit 1000 
    where rand() < 0.1 RETURN p,h ", 
    "CREATE (p)-[:HAS_HOBBY]->(h)", 
    {batchSize: 20000, parallel: true}) 
YIELD batches, total 
RETURN *

Without the APOC function the query looks like this:

MATCH(p:Person),(h:Hobby)
WITH p,h
LIMIT 10000
WHERE rand() < 0.1
CREATE (p)-[:HAS_HOBBY]->(h)

This is the query I have tried. The problem is that all the person nodes are linked to a single hobby node, so only 1 of the 20 hobby nodes is being used.
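
A WHERE attached to a WITH is applied after that WITH's LIMIT, so the rand() filter only ever sees the rows that survive the LIMIT. One way to check which rows those are is a diagnostic along these lines. It is only a sketch, assuming the same labels as above, and which rows come first depends on how the planner orders the cartesian product:

```cypher
// Count how many distinct persons and hobbies are left
// after the LIMIT is applied to the cartesian product.
MATCH (p:Person), (h:Hobby)
WITH p, h
LIMIT 1000
RETURN count(DISTINCT p) AS persons, count(DISTINCT h) AS hobbies
```

If hobbies comes back far smaller than 20, the LIMIT is cutting the product down to a few hobby nodes before any random filtering happens.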

Is there anything missing in my query? Or should I tackle this problem in a different way?

I have also tried different approaches, using FOREACH clauses to loop through all the nodes, or using SKIP and LIMIT on a cartesian product.

Thanks a lot!

Edit:

Query by InverseFalcon using apoc.periodic.iterate:

call apoc.periodic.iterate("
// first generate your range of how many hobbies you want a person to have
// for this example, 1 to 5 hobbies
WITH range(1,5) as hobbiesRange
// next get all hobbies in a list
MATCH (h:Hobby)
WITH collect(h) as hobbies, hobbiesRange
MATCH (p:Person)
// randomly pick number of hobbies in the range, use that to get a number of random hobbies
WITH p, apoc.coll.randomItems(hobbies, apoc.coll.randomItem(hobbiesRange)) as hobbies
// return each person with its hobbies; relationships are created in the second statement
RETURN p,hobbies", 
"FOREACH (hobby in hobbies | CREATE (p)-[:HAS_HOBBY]->(hobby))", 
{batchSize: 1000, parallel: false});
Questioner: user6278182
InverseFalcon 2019-07-08 08:32:14

It would be easier to not use iterate() in this case, but instead use some of APOC's collection helper functions, such as those used to get random items from a collection. Something like this:

// first generate your range of how many hobbies you want a person to have
// for this example, 1 to 5 hobbies
WITH range(1,5) as hobbiesRange
// next get all hobbies in a list
MATCH (h:Hobby)
WITH collect(h) as hobbies, hobbiesRange
MATCH (p:Person)
// randomly pick number of hobbies in the range, use that to get a number of random hobbies
WITH p, apoc.coll.randomItems(hobbies, apoc.coll.randomItem(hobbiesRange)) as hobbies
// create relationships
FOREACH (hobby in hobbies | CREATE (p)-[:HAS_HOBBY]->(hobby))
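
To sanity-check the result afterwards, something like the following could be run. It is a sketch that assumes the Person and Hobby labels and the HAS_HOBBY relationship type from the question, and it only reads data:

```cypher
// How many persons have 0, 1, 2, ... hobbies?
MATCH (p:Person)
OPTIONAL MATCH (p)-[:HAS_HOBBY]->(h:Hobby)
WITH p, count(h) AS hobbyCount
RETURN hobbyCount, count(p) AS persons
ORDER BY hobbyCount
```

With the 1 to 5 range used above, every person should land in a row with hobbyCount between 1 and 5, provided the graph had no HAS_HOBBY relationships before the query ran.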