Bigquery to get Speed based on time/location data (rows above/below current row)How to get the current time in PythonGet current time in milliseconds in Python?BigQuery Data Location Settingget list of pandas dataframe columns based on data typeCondition based on nth row above or below current row - PythonIs there a way to resample time series data in BigQuery?BigQuery speed - time from __SHUFFLE1_REPARTITION0BigQuery: SELECT in WHERE-clause with filter based on a value in the current rowbigquery for time series dataPandas time operations

Does the fruit of Mantra Japa automatically go to Indra if Japa Samarpana Mantra is not chanted?

Why are electrically insulating heatsinks so rare? Is it just cost?

How is it possible to have an ability score that is less than 3?

Risk of getting Chronic Wasting Disease (CWD) in the United States?

"to be prejudice towards/against someone" vs "to be prejudiced against/towards someone"

How much RAM could one put in a typical 80386 setup?

How is the claim "I am in New York only if I am in America" the same as "If I am in New York, then I am in America?

LaTeX closing $ signs makes cursor jump

Why was the small council so happy for Tyrion to become the Master of Coin?

How do we improve the relationship with a client software team that performs poorly and is becoming less collaborative?

Today is the Center

Why can't I see bouncing of a switch on an oscilloscope?

Why dont electromagnetic waves interact with each other?

What are these boxed doors outside store fronts in New York?

Approximately how much travel time was saved by the opening of the Suez Canal in 1869?

Service Entrance Breakers Rain Shield

Font hinting is lost in Chrome-like browsers (for some languages )

Did Shadowfax go to Valinor?

Minkowski space

What would happen to a modern skyscraper if it rains micro blackholes?

Are the number of citations and number of published articles the most important criteria for a tenure promotion?

Fencing style for blades that can attack from a distance

Why not use SQL instead of GraphQL?

strToHex ( string to its hex representation as string)



Bigquery to get Speed based on time/location data (rows above/below current row)


How to get the current time in PythonGet current time in milliseconds in Python?BigQuery Data Location Settingget list of pandas dataframe columns based on data typeCondition based on nth row above or below current row - PythonIs there a way to resample time series data in BigQuery?BigQuery speed - time from __SHUFFLE1_REPARTITION0BigQuery: SELECT in WHERE-clause with filter based on a value in the current rowbigquery for time series dataPandas time operations






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty height:90px;width:728px;box-sizing:border-box;








3















I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.










share|improve this question



















  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17

















3















I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.










share|improve this question



















  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17













3












3








3


1






I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.










share|improve this question
















I have a table in Bigquery with tracking data for Nascar drivers (dummy data for a project I am working on). The x and y coordinates are taken 10 times a second. The capture_frame signifies the current frame, and each sequential capture_frame should be 100 milliseconds apart because the data is taken every 100 ms.



I want to calculate each driver's speed per lap. I know how to do this in pandas but I think this is possible in bigquery. To calculate speed, I am looking at 2 rows before capture_frame and 2 rows after and then dividing by the difference in epoch time, which should be 400 milliseconds.



Here is an example of a few capture frames for 1 race for one driver for the first lap. There are a few hundred capture frames per lap and then 20 drivers mixed in as well, but it is easier to understand if we look at just one driver/race/lap.



+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| Race | Capture | Lap | Driver | … | X | Y | Epoch_time | Delta_dist | Curr_speed |
| | _frame | | | | | | | | |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 1 | 1 | Logano | …. | 2.1 | 1 | 1552089720 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 2 | 1 | Logano | … | 2.2 | 1.1 | 1552089820 | NULL | Null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 3 | 1 | Logano | … | 2.22 | 1.2 | 1552089920 | 2.265921446 | 0.005664804 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 4 | 1 | Logano | .. | 3.22 | 1.5 | 1552090020 | 3.124163888 | 0.00781041 |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 5 | 1 | Logano | .. | 4.22 | 1.8 | 1552090120 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+
| I500 | 6 | 1 | Logano | .. | 5.22 | 1.9 | 1552090220 | NULL | null |
+------+---------+-----+--------+----+------+-----+------------+-------------+-------------+


The delta_dist for frame 3 is calculated by sqrt((4.22-2.1)^2 + (1.8-1)^2)/1 and the the curr_speed is that number divided by 400. The first/last 2 distances and speeds of the race will be null because there are no prior x or y coordinates which is okay as there isn't really any speed when you are .1 second from starting or stopping.



In pandas I would do (this is not great code as I am just bringing each driver and race in on its own):



#laps_per_race dictionary with num laps per race
for driver in driver_list:
for race in race_list:
driver_race_query = “SELECT * from nascar_xyz where driver=driver and Race=race”.format(driver=driver, race=race)
df_entire_race = client.query(driver_race_query).to_dataframe()
num_laps = laps_per_race[race]
for lap in num_laps:
#get subset of dataframe just for this lap
df = df_entire_race.loc[df_entire_race['Lap'] == lap]
df.sort_values(‘Epoch_time’, inplace=True)
df[‘prev_x’] = df[‘X’].shift(2)
df[‘next_x’] = df[‘X’].shift(-2)
df[‘prev_y’] = df[‘Y’].shift(2)
df[‘next_y’] = df[‘Y’].shift(-2)
#this is just distance function sqrt((x2-x1)^2 + (y2-y1)^2)
df['delta_dist'] = np.sqrt((df[‘X’].shift(-2) - df[‘X’].shift(2))**2 + (df[‘Y’].shift(-2) - df[‘Y’].shift(2))**2))

#400.0 is the time actual difference
df['Curr_speed'] = df['delta_dist']/400.0


I think in my sql query I either have to do a group by or partition by to because I want to look in each race by driver_id, then lap (if that level of abstraction makes sense). Maybe for the speed and looking capture_frames ahead I can do something with windowing (https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts) or something called lag which seems like the equivalent of .shift() in pandas.







python sql pandas google-bigquery geospatial






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Mar 9 at 2:58









Pikachu the Purple Wizard

2,06161529




2,06161529










asked Mar 9 at 0:51









H. RoseH. Rose

191




191







  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17












  • 2





    Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

    – Felipe Hoffa
    Mar 9 at 1:11











  • it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

    – Mikhail Berlyant
    Mar 9 at 12:17







2




2





Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

– Felipe Hoffa
Mar 9 at 1:11





Tangential comment: I'm surprised this question got 3 upvotes in less than 5 minutes.

– Felipe Hoffa
Mar 9 at 1:11













it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

– Mikhail Berlyant
Mar 9 at 12:17





it is not clear - what output you expect to have. can you provide example please, so we can help without speculating too much

– Mikhail Berlyant
Mar 9 at 12:17












1 Answer
1






active

oldest

votes


















1














You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



WITH data AS (
SELECT bus, ST_GeogPoint(longitude, latitude) point
, PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
FROM `fh-bigquery.mta_nyc_si.201410_bustime`
WHERE day=20141014
AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
)


SELECT *
FROM (
SELECT bus, ts, distance/time speed
FROM (
SELECT bus, ts
, ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
, TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
FROM data
)
WHERE time IS NOT null
)
WHERE speed < 500


enter image description here






share|improve this answer























    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "1"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55072921%2fbigquery-to-get-speed-based-on-time-location-data-rows-above-below-current-row%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    1














    You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



    WITH data AS (
    SELECT bus, ST_GeogPoint(longitude, latitude) point
    , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
    FROM `fh-bigquery.mta_nyc_si.201410_bustime`
    WHERE day=20141014
    AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
    )


    SELECT *
    FROM (
    SELECT bus, ts, distance/time speed
    FROM (
    SELECT bus, ts
    , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
    , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
    FROM data
    )
    WHERE time IS NOT null
    )
    WHERE speed < 500


    enter image description here






    share|improve this answer



























      1














      You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



      WITH data AS (
      SELECT bus, ST_GeogPoint(longitude, latitude) point
      , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
      FROM `fh-bigquery.mta_nyc_si.201410_bustime`
      WHERE day=20141014
      AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
      )


      SELECT *
      FROM (
      SELECT bus, ts, distance/time speed
      FROM (
      SELECT bus, ts
      , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
      , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
      FROM data
      )
      WHERE time IS NOT null
      )
      WHERE speed < 500


      enter image description here






      share|improve this answer

























        1












        1








        1







        You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



        WITH data AS (
        SELECT bus, ST_GeogPoint(longitude, latitude) point
        , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
        FROM `fh-bigquery.mta_nyc_si.201410_bustime`
        WHERE day=20141014
        AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
        )


        SELECT *
        FROM (
        SELECT bus, ts, distance/time speed
        FROM (
        SELECT bus, ts
        , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
        , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
        FROM data
        )
        WHERE time IS NOT null
        )
        WHERE speed < 500


        enter image description here






        share|improve this answer













        You are in the right path. I'll take a public dataset of buses moving around Staten Island - and I'll use the geographical distance by looking at their lat,lon:



        WITH data AS (
        SELECT bus, ST_GeogPoint(longitude, latitude) point
        , PARSE_TIMESTAMP('%Y%m%d %H%M%S',FORMAT('%i %06d', day, time)) ts
        FROM `fh-bigquery.mta_nyc_si.201410_bustime`
        WHERE day=20141014
        AND bus IN (7043, 7086, 7076, 2421, 7052, 7071)
        )


        SELECT *
        FROM (
        SELECT bus, ts, distance/time speed
        FROM (
        SELECT bus, ts
        , ST_DISTANCE(point, LAG(point, 3) OVER(PARTITION BY bus ORDER BY ts)) distance
        , TIMESTAMP_DIFF(ts, LAG(ts, 3) OVER(PARTITION BY bus ORDER BY ts), SECOND) time
        FROM data
        )
        WHERE time IS NOT null
        )
        WHERE speed < 500


        enter image description here







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Mar 9 at 1:44









        Felipe HoffaFelipe Hoffa

        22.7k253122




        22.7k253122





























            draft saved

            draft discarded
















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55072921%2fbigquery-to-get-speed-based-on-time-location-data-rows-above-below-current-row%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Can't initialize raids on a new ASUS Prime B360M-A motherboard2019 Community Moderator ElectionSimilar to RAID config yet more like mirroring solution?Can't get motherboard serial numberWhy does the BIOS entry point start with a WBINVD instruction?UEFI performance Asus Maximus V Extreme

            Identity Server 4 is not redirecting to Angular app after login2019 Community Moderator ElectionIdentity Server 4 and dockerIdentityserver implicit flow unauthorized_clientIdentityServer Hybrid Flow - Access Token is null after user successful loginIdentity Server to MVC client : Page Redirect After loginLogin with Steam OpenId(oidc-client-js)Identity Server 4+.NET Core 2.0 + IdentityIdentityServer4 post-login redirect not working in Edge browserCall to IdentityServer4 generates System.NullReferenceException: Object reference not set to an instance of an objectIdentityServer4 without HTTPS not workingHow to get Authorization code from identity server without login form

            2005 Ahvaz unrest Contents Background Causes Casualties Aftermath See also References Navigation menue"At Least 10 Are Killed by Bombs in Iran""Iran"Archived"Arab-Iranians in Iran to make April 15 'Day of Fury'"State of Mind, State of Order: Reactions to Ethnic Unrest in the Islamic Republic of Iran.10.1111/j.1754-9469.2008.00028.x"Iran hangs Arab separatists"Iran Overview from ArchivedConstitution of the Islamic Republic of Iran"Tehran puzzled by forged 'riots' letter""Iran and its minorities: Down in the second class""Iran: Handling Of Ahvaz Unrest Could End With Televised Confessions""Bombings Rock Iran Ahead of Election""Five die in Iran ethnic clashes""Iran: Need for restraint as anniversary of unrest in Khuzestan approaches"Archived"Iranian Sunni protesters killed in clashes with security forces"Archived