2. Giovanni:
   Shall we try these queries on IMDB:
   Who has acted in the same film as X more than N times in the time period between y1 and y2?
   Who has acted in the same film as X more than N times every year in the time period between years y1 and y2?
      for y starting at 1990 to 2000
         ...
      end for
   Who went from being a pure actor without a directorial role to a director without any acting roles?
   Which director made the most movies between years y1 and y2?
   Which two people P1 and P2 had the experience that P1 directed P2 and then later P2 directed P1?

=========
Sept 4, 2024 15:00 Italy time

From now on using zoom: https://nyu.zoom.us/j/93663596718

1. Roberto/Antonio
   Try to understand why some queries work better with the vector approach and why others work better with the normalized approach.
   Start with Roberto's queries and then eventually Giovanni's.

   Dennis's suspicion: if the data is in memory and the queries are somewhat complicated, then the vector approach is faster.

   If the above is true, then we can say to the user:
      if small data, then vector
      else if you can partition the data, then lots of vector tables
      else normal form

   Vector approach: table has nodeid, property, property value, vector of start times, vector of end times
   Normalized approach: table has nodeid, property, property value, start time, end time

   Schema: https://drive.google.com/drive/folders/1v3yNjNAXuYADiRnUADlw9qSFGWPO71Jh?usp=drive_link

2. Giovanni:
   Shall we try these queries on IMDB:
   Who has acted in the same film as X more than N times in the time period between y1 and y2?
   Who has acted in the same film as X more than N times every year in the time period between years y1 and y2?
   Loops are possible:
      for y starting at 1990 to 2000
         ...
      end for
   Who went from being a pure actor without a directorial role to a director without any acting roles?
   Which director made the most movies between years y1 and y2?
   Which two people P1 and P2 had the experience that P1 directed P2 and then later P2 directed P1?
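
A minimal sketch of the two table layouts from the Sept 4 notes above (vector vs. normalized), to make the difference concrete. The sample rows, the function names, and the use of plain Python lists instead of a real database are all assumptions for illustration, not the project's actual implementation. It shows the same "which nodes had this property value at some point between y1 and y2" question answered over both layouts.

```python
from collections import defaultdict

# Hypothetical sample facts in the normalized layout:
# (nodeid, property, property value, start time, end time), years inclusive.
NORMALIZED_ROWS = [
    (1, "role", "actor", 1990, 1995),
    (1, "role", "actor", 1998, 2001),
    (1, "role", "director", 2002, 2010),
    (2, "role", "actor", 1993, 1994),
]


def to_vector_rows(rows):
    """Collapse normalized rows into one row per (nodeid, property, value),
    carrying a vector of start times and a vector of end times."""
    grouped = defaultdict(lambda: ([], []))
    for nodeid, prop, value, start, end in rows:
        starts, ends = grouped[(nodeid, prop, value)]
        starts.append(start)
        ends.append(end)
    return [(n, p, v, s, e) for (n, p, v), (s, e) in grouped.items()]


def overlaps(start, end, y1, y2):
    """True if the interval [start, end] intersects [y1, y2]."""
    return start <= y2 and end >= y1


def query_normalized(rows, prop, value, y1, y2):
    """Node ids holding (prop, value) at some point in [y1, y2]."""
    return sorted({n for n, p, v, s, e in rows
                   if p == prop and v == value and overlaps(s, e, y1, y2)})


def query_vector(rows, prop, value, y1, y2):
    """Same question over the vector layout: scan each row's interval vectors."""
    return sorted(n for n, p, v, starts, ends in rows
                  if p == prop and v == value
                  and any(overlaps(s, e, y1, y2) for s, e in zip(starts, ends)))


VECTOR_ROWS = to_vector_rows(NORMALIZED_ROWS)
```

In this toy form the vector layout has fewer rows (one per node/property/value), while the normalized layout needs a de-duplication step; whether that explains any speed difference in practice is exactly the open question in the notes.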
=========
Sept 18, 2024 15:00 Italy time

From now on using zoom: https://nyu.zoom.us/j/93663596718

1. Roberto/Antonio
   Roberto has submitted his thesis. Congratulations!
   Try to understand why some queries work better with the vector approach and why others work better with the normalized approach.
   Start with Roberto's queries and then eventually Giovanni's.

2. Giovanni:
   Final touches on IMDB queries.
   Guide the user on what to do if our language doesn't support a particular query.

=========
Oct 2, 2024 15:00 Italy time

From now on using zoom: https://nyu.zoom.us/j/93663596718

1. Antonio:
   Convert the parser to support Giovanni's queries to Roberto's backend. Include the group by.
   How many actors collaborated with actor A in each year from 2006 to 2013?
   Find all actors that collaborated with A between 2006 and 2013, then put those collaborations and their dates into a SQLite database and do the group by.

2. Roberto
   Continuing with the implementation.
   Will want to compare the normalized approach with neo4j on attribute-value selection queries and on Giovanni's queries.
   Try to understand why some queries work better with the vector approach and why others work better with the normalized approach.

3. Giovanni:
   Start from the queries and then change the constants to different constant values. Also change the group by.
   Final touches on IMDB queries.

=========
Oct 9, 2024 15:00 Italy time

From now on using zoom: https://nyu.zoom.us/j/93663596718

1. Antonio:
   Completed the Cypher parser. Work with Roberto on the parser.
   How many actors collaborated with actor A in each year from 2006 to 2013?
   Find all actors that collaborated with A between 2006 and 2013, then put those collaborations and their dates into a SQLite database and do the group by.
   Collaborate with Giovanni to generate the queries.

2. Roberto
   Bit matrix should not include time.
   Continuing with the implementation with some interruption from his thesis.
   One goal we should always keep in mind is that the implementation will also support modifications and updates (especially new data).

3. Giovanni:
   Uses citation network and the paper with temporal benchmarks.
   Nodes are papers and papers with research area. Citation and affiliation and journal.
   bitcoin trading network
   collaboration on social network
   post and comment network

=========
Oct 16, 2024 15:00 Italy time

From now on using zoom: https://nyu.zoom.us/j/93663596718

1. Antonio:
   Completed the Cypher parser. Work with Roberto on the parser.
   How many actors collaborated with actor A in each year from 2006 to 2013?
   Find all actors that collaborated with A between 2006 and 2013, then put those collaborations and their dates into a SQLite database and do the group by.
   Collaborate with Giovanni to generate the queries.

2. Roberto
   Bit matrix should not include time.
   Continuing with the implementation with some interruption from his thesis.
   One goal we should always keep in mind is that the implementation will also support modifications and updates (especially new data).

3. Giovanni:
   Uses citation network and the paper with temporal benchmarks.
   Nodes are papers and papers with research area. Citation and affiliation and journal.
   bitcoin trading network
   collaboration on social network
   post and comment network

=========
Dear friends,
I am skipping Oct 23 because I am in an all-day meeting. Roberto, please send me a summary.
Also, on Oct 30, 2024, we will meet at 14:00 just for that day, because Europe changes from daylight saving time one week before we do in the U.S.
As of Nov 6, we'll go back to 15:00.

Still using zoom: https://nyu.zoom.us/j/93663596718

1. Antonio:
   Work with Roberto on the parser.
   How many actors collaborated with actor A in each year from 2006 to 2013?
   Find all actors that collaborated with A between 2006 and 2013, then put those collaborations and their dates into a SQLite database and do the group by.
   Collaborate with Giovanni to generate the queries for IMDB and the forum benchmark.

2. Roberto
   Bit matrix should not include time.
   Continuing with the implementation with some interruption from his thesis.
   One goal we should always keep in mind is that the implementation will also support modifications and updates (especially new data).

3. Giovanni:
   Other benchmarks:
   * Revisit Panama
   * For future: Maybe a benchmark that uses the neo4j facilities to do transitive closure (please discuss with Roberto), either by calling neo4j or taking their code. The question, if we do this, is what our contribution would be. Maybe we need to know the state of the art. Perhaps ask Antonio/Giovanni's students to do a literature survey on transitive closure algorithms on graphs (especially time-constrained ones)?
     - How to go from A to B fastest (or cheapest) starting on this date and time? e.g. Can I get from Fiji to Catania in 24 hours?
     - Which airports suffer the most cancellations or the most delays?
     - Find alternative routes.

=========
Dear friends,
Nov 6, 15:00.

Zoom: https://nyu.zoom.us/j/93663596718

1. Antonio:
   Work with Roberto on the parser.
   How many actors collaborated with actor A in each year from 2006 to 2013?
   Find all actors that collaborated with A between 2006 and 2013, then put those collaborations and their dates into a SQLite database and do the group by.
   Collaborate with Giovanni to generate the queries for IMDB and the forum benchmark.

2. Roberto
   Bit matrix should not include time.
   Continuing with the implementation with some interruption from his thesis.
   One goal we should always keep in mind is that the implementation will also support modifications and updates (especially new data).

3. Giovanni:
   Other benchmarks:
   * Revisit Panama
   * For future: Maybe a benchmark that uses the neo4j facilities to do transitive closure (please discuss with Roberto), either by calling neo4j or taking their code. The question, if we do this, is what our contribution would be. Maybe we need to know the state of the art. Perhaps ask Antonio/Giovanni's students to do a literature survey on transitive closure algorithms on graphs (especially time-constrained ones)?
     - How to go from A to B fastest (or cheapest) starting on this date and time? e.g. Can I get from Fiji to Catania in 24 hours?
     - Which airports suffer the most cancellations or the most delays?
     - Find alternative routes.

=========
Dear friends,
Nov 13, 15:00.

Zoom: https://nyu.zoom.us/j/93663596718

1. Antonio:
   Will take over the code from Roberto, but he will focus on his other projects till January to see if he can free himself up for our project then. He may also try to recruit a new student whom he can direct.
   Collaborate with Giovanni to generate the queries for IMDB and the forum benchmark.

2. Roberto
   Bit matrix should not include time.
   Put the code in good shape so that Antonio can take over.
   One goal we should always keep in mind is that the implementation will also support modifications and updates (especially new data).

3. Giovanni:
   Other benchmarks:
   * Revisit Panama
   * For future: Maybe a benchmark that uses the neo4j facilities to do transitive closure (please discuss with Roberto), either by calling neo4j or taking their code. The question, if we do this, is what our contribution would be. Maybe we need to know the state of the art. Perhaps ask Antonio/Giovanni's students to do a literature survey on transitive closure algorithms on graphs (especially time-constrained ones)?
     - How to go from A to B fastest (or cheapest) starting on this date and time? e.g. Can I get from Fiji to Catania in 24 hours?
     - Which airports suffer the most cancellations or the most delays?
     - Find alternative routes.
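
The "Fiji to Catania in 24 hours" question above is an earliest-arrival query on a time-constrained graph. As a rough sketch only: the flight data below is invented, times are plain hour offsets from a common reference, connection time is assumed to be zero, and every flight is assumed to arrive strictly after it departs (which is what makes a single pass in departure order sufficient). The function names are hypothetical, not part of any existing API.

```python
import math

# Invented flights: (origin, destination, depart_hour, arrive_hour).
FLIGHTS = [
    ("NAN", "SYD", 2, 7),
    ("NAN", "LAX", 1, 12),
    ("SYD", "DXB", 9, 23),
    ("DXB", "FCO", 25, 31),
    ("LAX", "FCO", 14, 27),
    ("FCO", "CTA", 28, 29.5),
    ("FCO", "CTA", 33, 34.5),
]


def earliest_arrival(flights, source, target, start_time):
    """Earliest time at which target can be reached when leaving source no
    earlier than start_time, or None if it cannot be reached.

    Scans flights once in departure order. This is valid under the stated
    assumption that arrive > depart for every flight."""
    earliest = {source: start_time}
    for origin, dest, depart, arrive in sorted(flights, key=lambda f: f[2]):
        can_board = earliest.get(origin, math.inf) <= depart
        if can_board and arrive < earliest.get(dest, math.inf):
            earliest[dest] = arrive
    return earliest.get(target)


def reachable_within(flights, source, target, start_time, budget):
    """True if target is reachable within `budget` hours of start_time."""
    arrival = earliest_arrival(flights, source, target, start_time)
    return arrival is not None and arrival - start_time <= budget
```

For example, `reachable_within(FLIGHTS, "NAN", "CTA", 0, 24)` asks the 24-hour question on the invented schedule. A "cheapest" variant would need a different algorithm, since cost does not follow departure order.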
=========
Dec 4, 15:00.

Zoom: https://nyu.zoom.us/j/93663596718

1. Antonio:
   Will take over the code from Roberto, but he will focus on his other projects till January to see if he can free himself up for our project then. He may also try to recruit a new student whom he can direct.
   Collaborate with Giovanni to generate the queries for IMDB and the forum benchmark.

2. Roberto
   Bit matrix should not include time.
   Put the code in good shape so that Antonio can take over.
   One goal we should always keep in mind is that the implementation will also support modifications and updates (especially new data).

3. Giovanni:
   Other benchmarks:
   * Revisit Panama
   * For future: Maybe a benchmark that uses the neo4j facilities to do transitive closure (please discuss with Roberto), either by calling neo4j or taking their code. The question, if we do this, is what our contribution would be. Maybe we need to know the state of the art. Perhaps ask Antonio/Giovanni's students to do a literature survey on transitive closure algorithms on graphs (especially time-constrained ones)?
     - How to go from A to B fastest (or cheapest) starting on this date and time? e.g. Can I get from Fiji to Catania in 24 hours?
     - Which airports suffer the most cancellations or the most delays?
     - Find alternative routes.

=========
Jan 15, 2025 15:00 Italy time

Dear friends,
I will be returning on Jan 15, so I may not make it on time, but I hope to. Please meet in any case.

Jan 15, 2025 15:00.
Zoom: https://nyu.zoom.us/j/93663596718

1. Antonio/Roberto:
   Antonio will take over the code from Roberto.
   Please send Krish the API when available, and also the current Panama queries.

2. Giovanni/Alfredo:
   Alfredo reads the multigraphmatch paper and then Giovanni submits.

3. Krish Jain
   Krish, we're thinking you could start by helping with benchmarking security-like scenarios using the Panama Papers.
   Here are some that we thought of before:

   Whom did X email with in some date range?

   If X and Y are involved, then who is in common that they both emailed with before a certain date?

   Has X sent email to more than 10 people in any 4 hour interval?

   How many people have sent emails to X in these six hours?

   Is there a sequence of emails linking X to Y such that the sequence is ordered in time, e.g. X sent to A on day 1, A sent to B on day 2, ..., Z sent to Y on day 25?
      What if that sequence cannot be more than k long?
      What if the interval between messages in that sequence must be at least one hour but no more than one day?

   Find the list of everyone who transitively sent to X (either directly to X or through intermediaries) over the last four days.
      Among this set of people, which pairs have communicated the most over the last three days?

   Define a session as an email exchange between X and Y such that there is a communication from X to Y or from Y to X every day or more frequently.
      How many sessions are there between X and Y in the last month?

   Given a set of known lawyers and a set of other people who at one point met the lawyers:
      Is there any pair of people P1 and P2 who have both used this lawyer and that bank in the past and who later exchanged emails?
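
The time-ordered chain question above (a sequence of emails from X to Y, optionally at most k long and with a bounded gap between consecutive messages) can be sketched as a breadth-first search over (recipient, time) states. This is only an illustration under assumptions: the email tuples are invented, times are plain hour offsets, and the function and parameter names (`has_time_ordered_chain`, `max_len`, `min_gap`, `max_gap`) are hypothetical rather than part of the project's query language.

```python
import math
from collections import defaultdict, deque

# Invented emails: (sender, recipient, time_in_hours).
EMAILS = [
    ("X", "A", 0),
    ("A", "B", 5),
    ("A", "B", 30),
    ("B", "Y", 20),
]


def has_time_ordered_chain(emails, x, y, max_len=None, min_gap=0, max_gap=math.inf):
    """True if some chain of emails leads from x to y where each email is sent
    between min_gap and max_gap hours after the previous one in the chain.
    max_len (assumed >= 1 when given) caps the number of emails in the chain."""
    outgoing = defaultdict(list)
    for sender, recipient, t in emails:
        outgoing[sender].append((recipient, t))

    # Each state is (current holder, time of the email that reached them, chain length).
    queue = deque((recipient, t, 1) for recipient, t in outgoing[x])
    seen = {(recipient, t) for recipient, t, _ in queue}
    while queue:
        person, t, length = queue.popleft()
        if person == y:
            return True
        if max_len is not None and length >= max_len:
            continue  # chain may not grow any further
        for nxt, t2 in outgoing[person]:
            if min_gap <= t2 - t <= max_gap and (nxt, t2) not in seen:
                seen.add((nxt, t2))
                queue.append((nxt, t2, length + 1))
    return False
```

Because the search proceeds level by level, a state is first reached with the shortest possible chain, so the `seen` set does not hide a shorter chain that would satisfy `max_len`. With the default `min_gap=0`, two emails at the same timestamp count as ordered; pass a positive `min_gap` to require strictly increasing times.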