Neo4j Snowplow Integration
Neo4j is the world's leading open-source graph database. Graph databases are widely used in enterprise fraud detection, real-time recommendations, social networking, marketing attribution, identity resolution, etc. Neo4j is democratizing access to graph databases with AuraDB, an affordable cloud-hosted Neo4j.
At SnowcatCloud, we believe in the rise of graph databases, as they provide insights into the relationships between entities in a way that relational databases cannot.
We created a Snowplow Neo4j integration that streams Snowplow behavioral event data into your instance of Neo4j, allowing our customers to create and maintain a graph database with their behavioral event data in minutes.
How does it work?
Events tracked in your Snowplow data stream are transformed and replicated into your Neo4j instance in real time, where you can query them using Bloom or Cypher.
Supported event types
Event | Event Name | Vendor |
---|---|---|
page_view | page_view | com.snowplowanalytics.snowplow |
transaction | transaction | com.snowplowanalytics.snowplow |
transaction_item | transaction_item | com.snowplowanalytics.snowplow |
Neo4j Snowplow Integration
Out of the box, all supported event types are transformed and sent to Neo4j in real time. This integration enables the creation and maintenance of a behavioral identity graph in Neo4j with minimal effort.
Real-time Neo4j Graph Update
You can create and update your graph by sending self-describing events with the com.snowcatcloud.iceberg schema to your Snowplow collector.
As the events pass through your SnowcatCloud account, we transform and forward them to Neo4j in real time.
Identifier
The identifier id property is unique per node and defined by the fields listed below:
- FingerprintJS visitorId
- Cookies, if available: _gaexp, cart, ajs_user_id, ajs_anonymous_id
- Snowplow cookies: domain_user, network_userid
- Snowplow user_id
<script>
  // Snowplow JS Tracker V3.x
  // Look up a node by an existing identifier id; all devices linked to it
  // are given an additional new identifier id
  snowplow("trackSelfDescribingEvent", {
    event: {
      schema: "iglu:com.snowcatcloud.iceberg/identifier/jsonschema/1-0-0",
      data: {
        lookup_id: "existing identifier id",
        source: "new identifier id source",
        name: "new identifier id name",
        id: "new identifier id",
      },
    },
  });
</script>
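If you send identifier events from several places in your site code, a small helper can keep the payload shape consistent. The following is a minimal sketch, not part of the integration itself: the function name buildIdentifierEvent and its camelCase argument names are hypothetical, and treating all four fields as required non-empty strings is an assumption carried over from the bulk upload rules rather than something the schema is confirmed to enforce.

```javascript
// Schema URI copied from the tracker example above.
const IDENTIFIER_SCHEMA =
  "iglu:com.snowcatcloud.iceberg/identifier/jsonschema/1-0-0";

// Hypothetical helper: builds the argument for trackSelfDescribingEvent.
// Assumption: every field must be a non-empty string.
function buildIdentifierEvent({ lookupId, source, name, id }) {
  const data = { lookup_id: lookupId, source, name, id };
  for (const [key, value] of Object.entries(data)) {
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`Identifier event field "${key}" must be a non-empty string`);
    }
  }
  return { event: { schema: IDENTIFIER_SCHEMA, data } };
}

// Placeholder values for illustration only.
const identifierPayload = buildIdentifierEvent({
  lookupId: "existing identifier id",
  source: "new identifier id source",
  name: "new identifier id name",
  id: "new identifier id",
});

// Only send when the Snowplow tracker is loaded on the page.
if (typeof snowplow === "function") {
  snowplow("trackSelfDescribingEvent", identifierPayload);
}
```

Validating before sending means a malformed identifier fails loudly in the browser instead of being dropped later in the pipeline.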
Bulk Neo4j Graph Update
You can also enrich your behavioral identity graph with terabytes of offline data, either your own or from third-party providers (Tapad, Experian, Verizon, etc.), by uploading CSV files into an S3 bucket.
Identifier
The identifier bulk graph update lets you enrich existing identifiers by looking up an existing identifier and adding a new identifier to all the connected devices.
The identifier id property is unique per node and defined by the fields listed below:
- FingerprintJS visitorId
- Cookies, if available: _gaexp, cart, ajs_user_id, ajs_anonymous_id
- Snowplow cookies: domain_user, network_userid
- Snowplow user_id
The example below illustrates a user who submits a form, creating a data entry that associates their cookie with personal data. Note that this enrichment can happen in real time, in bulk, or both. The goal is to tie together as many identifiers as possible to aggregate customer behavior across devices.
File Upload Requirements
SnowcatCloud customers are provided with a dedicated encrypted S3 bucket to upload data files. Data is processed in real time or in batch mode.
- No headers
- CSV file(s), gzipped
- All columns are mandatory
Lookup Identifier Id | Source | Name | New Identifier Id |
---|---|---|---|
A | salesforce | phonenumber | B |
Example:
52147316-857b-489b-affd-b40dc7aead94,tapad.email.hash,email,2238fe6d9aa0a9de
b0bffd39-c6fc-46ab-9c75-659886f2bb31,tapad.email.hash,email,73f5f793711859cf
...
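The file requirements above (no headers, gzipped CSV, all four columns mandatory) can be sketched in Node.js as follows. This is an illustration under stated assumptions, not SnowcatCloud tooling: the function names and row property names are hypothetical, the column order is taken from the table above, and because the quoting rules are not documented here, the sketch rejects values containing commas, quotes, or line breaks instead of guessing how to escape them.

```javascript
const zlib = require("node:zlib");

// Column order from the table above:
// Lookup Identifier Id, Source, Name, New Identifier Id.
const COLUMNS = ["lookupId", "source", "name", "newId"];

// Hypothetical helper: turns one row object into one CSV line.
function toCsvLine(row) {
  return COLUMNS.map((column) => {
    const value = row[column];
    // All columns are mandatory.
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`Missing mandatory column: ${column}`);
    }
    // Assumption: no quoting support, so reject characters that would need it.
    if (/[",\r\n]/.test(value)) {
      throw new Error(`Unsupported character in column: ${column}`);
    }
    return value;
  }).join(",");
}

// Hypothetical helper: returns a gzipped, header-less CSV as a Buffer.
function buildIdentifierCsv(rows) {
  const body = rows.map(toCsvLine).join("\n") + "\n";
  return zlib.gzipSync(Buffer.from(body, "utf8"));
}

const gzippedCsv = buildIdentifierCsv([
  {
    lookupId: "52147316-857b-489b-affd-b40dc7aead94",
    source: "tapad.email.hash",
    name: "email",
    newId: "2238fe6d9aa0a9de",
  },
]);

// To save the file before uploading it to your bucket:
// require("node:fs").writeFileSync("identifiers.csv.gz", gzippedCsv);
```

Building the whole file in memory keeps the sketch short; for very large exports, a streaming approach with zlib.createGzip() would likely be a better fit.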