I have loaded the entire IMDB catalog onto a database in SQL Server 15.0.4102.2 Developer edition. It is a quad core i7, 32GB of RAM. This query/data I will describe is not maxing out memory (I don't have a max set, but in performance monitor, SQL Server is using 3.9 GB of ram, and I still have 16.9GB free. When I execute this query:
SELECT LAST_VALUE(Person2.Name) WITHIN GROUP (GRAPH PATH) AS ConnectedToAccountHandle
FROM Imdb.Person, Imdb.ContributedTo FOR PATH AS ContributedTo, Imdb.Title FOR PATH AS Title, Imdb.ContributedTo FOR PATH AS ContributedTo2, Imdb.Person FOR PATH AS Person2
WHERE MATCH(SHORTEST_PATH(Person(-(ContributedTo)->Title<-(ContributedTo2)-Person2)+)) AND Person.PrimaryName = 'Fred Astaire'
I get the following, VERY quickly: Msg 596, Level 21, State 1, Line 8 Cannot continue the execution because the session is in the kill state. Msg 0, Level 20, State 0, Line 8 A severe error occurred on the current command. The results, if any, should be discarded. The data is there, as this:
SELECT Person.PrimaryName, Person2.PrimaryName, Title.Name
ROM Imdb.Person, Imdb.ContributedTo AS ContributedTo, Imdb.Title AS Title, Imdb.ContributedTo AS ContributedTo2, Imdb.Person AS Person2
WHERE MATCH(Person-(ContributedTo)->Title<-(ContributedTo2)-Person2) AND Person.PrimaryName = 'Fred Astaire';
Returns 2681 connections. It is not really using a lot of memory or CPU to return this data, and the failure is immediate. The syntax is seemingly fine. I have done this on a smaller scale with the following query:
SELECT LAST_VALUE(Account2.AccountHandle) WITHIN GROUP (GRAPH PATH) AS ConnectedToAccountHandle FROM SocialGraph.Account AS Account1 ,SocialGraph.Account FOR PATH AS Account2 ,SocialGraph.Interest FOR PATH AS Interest ,SocialGraph.InterestedIn FOR PATH AS InterestedIn ,SocialGraph.InterestedIn FOR PATH AS InterestedIn2 --Account1 is interested in an interest, and Account2 is also
WHERE MATCH(SHORTEST_PATH(Account1(-(InterestedIn)->Interest<-(InterestedIn2)-Account2)+)) -- The interesting part
AND Account1.AccountHandle = '@home '
The structure to my imdb tables is:
CREATE TABLE [Imdb].Title
AS NODE WITH (DATA_COMPRESSION = PAGE)
GO
CREATE TABLE [Imdb].Person
AS NODE WITH (DATA_COMPRESSION = PAGE)
GO
CREATE TABLE [Imdb].ContributedTo
AS EDGE WITH (DATA_COMPRESSION = PAGE)
GO
CREATE INDEX fromTo ON Imdb.ContributedTo ($from_id, $to_id)
CREATE INDEX ToFrom ON Imdb.ContributedTo ($to_id, $from_id)
GO
There is 10,756,900 people, and 43,312,922 contributed to rows, spread across 12 different types of contribution: director actor composer editor archive_footage actress writer self archive_sound cinematographer producer production_designer So it is a lot of data, and if it was churning and not finishing for a LONG time, I would probably understand. I added indexes on the from and to columns and fetching one level of connection is subsecond fast...