Questions tagged [google-bigquery]
Google BigQuery is a Google Cloud Platform product providing serverless queries of petabyte-scale data sets using SQL. BigQuery provides multiple read-write pipelines, and enables data analytics that transform how businesses analyze data.
26,322
questions
125
votes
8
answers
155k
views
Is there a way to export a BigQuery table's schema as JSON?
A BigQuery table has a schema that can be viewed in the web UI, updated, or used to load data with the bq tool as a JSON file. However, I can't find a way to dump this schema from an existing table to ...
107
votes
3
answers
77k
views
What's the difference between BigQuery and Bigtable? [closed]
Is there any reason why someone would use Bigtable instead of BigQuery? Both seem to support read and write operations, with the latter also offering advanced 'Query' operations.
I need to develop an ...
102
votes
1
answer
169k
views
Cannot access field in Big Query with type ARRAY<STRUCT<hitNumber INT64, time INT64, hour INT64, ...>>
I'm trying to run a query using the Standard SQL dialect (i.e., not Legacy SQL) on BigQuery. My query is:
SELECT
date, hits.referer
FROM `refresh.ga_sessions_xxxxxx*`
LIMIT 1000
But I keep getting the error
...
94
votes
6
answers
105k
views
Random Sampling in Google BigQuery
I just discovered that the RAND() function, while undocumented, works in BigQuery. I was able to generate a (seemingly) random sample of 10 words from the Shakespeare dataset using:
SELECT word FROM
(...
82
votes
16
answers
304k
views
Setting GOOGLE_APPLICATION_CREDENTIALS for BigQuery Python CLI
I'm trying to connect to Google BigQuery through the BigQuery API, using Python.
I'm following this page here:
https://cloud.google.com/bigquery/bigquery-api-quickstart
My code is as follows:
import ...
78
votes
2
answers
88k
views
Update or Delete tables with streaming buffer in BigQuery?
I'm getting the following error when trying to delete records from a table created through the GCP Console and updated with the GCP BigQuery Node.js table insert function.
UPDATE or DELETE DML statements ...
77
votes
5
answers
146k
views
Select All Columns Except Some in Google BigQuery?
Is there a way to Select * except [x,y,z column names] in BigQuery? I see some solutions for MySQL but am not sure whether they apply to BQ.
Thank you.
76
votes
10
answers
137k
views
Delete duplicate rows from a BigQuery table
I have a table with >1M rows of data and 20+ columns.
Within my table (tableX) I have identified duplicate records (~80k) in one particular column (troubleColumn).
If possible I would like to retain ...
72
votes
8
answers
78k
views
How do I identify the Google Cloud Storage URI from my Google Developers Console?
When I attempt to load data into BigQuery from Google Cloud Storage, it asks for the Google Cloud Storage URI (gs://). I have reviewed all of your online support as well as Stack Overflow and cannot find ...
67
votes
1
answer
37k
views
BigQuery - Datetime vs Timestamp
I looked at the documentation for Google BigQuery data types, checking the differences between the TIMESTAMP and DATETIME data types.
As I understand it, the main difference is:
Unlike Timestamps, a ...
64
votes
11
answers
200k
views
How to create temporary table in Google BigQuery
Is there any way to create a temporary table in Google BigQuery through:
SELECT * INTO <temp table>
FROM <table name>
same as we can create in SQL?
For complex queries, I need to ...
61
votes
6
answers
147k
views
Setting Big Query variables like mysql
What is the BigQuery equivalent of MySQL variables like:
SET @fromdate = '2014-01-01 00:00:00', -- dates for after 2013
@todate='2015-01-01 00:00:00',
@bfromdate = '2005-01-01 00:00:00', -- dates ...
57
votes
1
answer
59k
views
SQL array flattening: Why doesn't CROSS JOIN UNNEST join every nested value with every row?
This question isn't about solving a particular problem; it's about understanding what's actually happening behind the scenes in a common SQL idiom used to flatten arrays. There's some magic behind ...
53
votes
5
answers
109k
views
How to Auth to Google Cloud using Service Account in Python?
I'm trying to make a project that will upload a Google Storage JSON file to BigQuery (just automating something that is done manually now), and I'd like to use a 'service account' for this as my script is ...
52
votes
3
answers
199k
views
Can we cast the type in BigQuery?
Here is my query:
SELECT SQRT((D_o_latitude - T_s_lat)^2+(D_o_longitude - T_s_long)^2)/0.00001 FROM [datasetName.tableName]
I am getting this error: Error: Argument type mismatch in function ...
51
votes
4
answers
210k
views
STRING to DATE in BIGQUERY
I am struggling to try to do this with Google BigQuery:
I have a column with dates in the following STRING format:
6/9/2017 (M/D/YYYY)
I am wondering how I can deal with this, trying to use the ...
51
votes
4
answers
76k
views
Google BigQuery There are no primary key or unique constraints, how do you prevent duplicated records being inserted? [closed]
Google BigQuery has no primary key or unique constraints. As such, we cannot use traditional SQL options such as insert ignore or insert on duplicate key update.
If I have to call delete first (based ...
49
votes
7
answers
194k
views
Google BigQuery Delete Rows?
Does anyone know of any plans to add support for deleting parts of the data from a table in Google BigQuery? The issue we have right now is that we are using it for analytics of data points we collect over time. We ...
47
votes
7
answers
219k
views
BigQuery converting to a different timezone
I am storing data as Unix timestamps in Google BigQuery. However, when a user asks for a report, she will need the data filtered and grouped by her local timezone.
The data is stored in ...
47
votes
3
answers
42k
views
How to exclude NULLs from ARRAY so query won't fail
The ARRAY_AGG aggregate function includes NULLs in the arrays it builds. When such arrays are part of a query result, the query fails with the error:
Array cannot have a null element; error in writing field
i....
46
votes
4
answers
42k
views
How can I undelete a BigQuery table?
I've accidentally deleted one of my BigQuery tables. Is it possible to get it back? The API doesn't seem to support undelete.
45
votes
6
answers
135k
views
Bigquery query to find the column names of a table
I need a query to find the column names of a table (table metadata) in BigQuery, like the following query in SQL:
SELECT column_name,data_type,data_length,data_precision,nullable FROM all_tab_cols where ...
44
votes
4
answers
81k
views
Google BQ - how to upsert existing data in tables?
I'm using the Python client library for loading data into BigQuery tables. I need to update some changed rows in those tables, but I couldn't figure out how to update them correctly. I want some similar ...
44
votes
3
answers
89k
views
Efficiently write a Pandas dataframe to Google BigQuery
I'm trying to upload a pandas.DataFrame to Google BigQuery using the pandas.DataFrame.to_gbq() function documented here. The problem is that to_gbq() takes 2.3 minutes while uploading directly to ...
43
votes
8
answers
61k
views
Exporting data from Google Cloud Storage to Amazon S3
I would like to transfer data from a table in BigQuery, into another one in Redshift.
My planned data flow is as follows:
BigQuery -> Google Cloud Storage -> Amazon S3 -> Redshift
I know about ...
42
votes
3
answers
76k
views
difference in minutes between 2 bigquery timestamp fields
How can I get the difference in minutes between 2 timestamp fields in Google BigQuery?
The only function I know is Datediff, which gives the difference in days.
Thanks
39
votes
5
answers
57k
views
How to get BigQuery storage size for a single table
I want to calculate the per-table cost of Google BigQuery storage, but I don't know how to view the storage size of each table individually.
38
votes
3
answers
99k
views
How can I extract date from epoch time in BigQuery SQL
I have a date stored as epoch time and I want to extract
the date from it. I tried the code below and I get null as output.
date_add( (timestamp( Hp.ASSIGN_TIME)), 1970-01-01,"second" ) as ...
37
votes
3
answers
41k
views
What is Google's Dremel? How is it different from Mapreduce?
Google's Dremel is described here. What's the difference between Dremel and Mapreduce?
37
votes
7
answers
129k
views
How to Pivot table in BigQuery
I am using Google BigQuery, and I am trying to get a pivoted result from a public sample data set.
A simple query to an existing table is:
SELECT *
FROM publicdata:samples.shakespeare
LIMIT 10;
...
37
votes
3
answers
126k
views
BigQuery SQL WHERE Date Between Current Date and -15 Days
I am trying to code the following condition in the WHERE clause of SQL in BigQuery, but I am having difficulty with the syntax, specifically date math:
WHERE date_column between current_date() and ...
37
votes
3
answers
19k
views
BigQuery Date-Partitioned Views
BigQuery allows you to create date-partitioned tables:
https://cloud.google.com/bigquery/docs/creating-partitioned-tables
I'd like to be able to create views on top of date-partitioned tables and ...
36
votes
3
answers
64k
views
efficient way to compare two tables in bigquery
I am interested in comparing whether two tables contain the same data.
I could do it like this:
#standardSQL
SELECT
key1, key2
FROM
(
SELECT
table1.key1,
table1.key2,
table1....
36
votes
2
answers
51k
views
Is there a function to get the max of two values in Google BigQuery?
I want to get the maximum value of two integers (or two floats).
I know I can do it with an IF function like this:
IF (column1 > column2, column1, column2)
However, I was wondering if a function to do ...
35
votes
6
answers
36k
views
BigQuery - Where can I find the error stream?
I have uploaded a CSV file with 300K rows from GCS to BigQuery, and received the following error:
Where can I find the error stream?
I've changed the create table configuration to allow 4000 errors ...
34
votes
6
answers
37k
views
How to convert a non-partitioned BigQuery table to partitioned?
In June the BQ team announced support for date-partitioned tables. But the guide is missing how to migrate old non-partitioned tables into the new style.
I am looking for a way to update several or ...
33
votes
7
answers
68k
views
What are the bigquery keyboard shortcuts?
Google's BigQuery editor has keyboard shortcuts. For example, ctrl+space composes a new query. I suspect there are more shortcuts, but I haven't found a useful list of them. Does anyone know them?
33
votes
3
answers
47k
views
how to rename a table without re creating it
I didn't find a RENAME option to alter a table name.
I have a case where I must rename a table, and the only way is to select with the result written to a new table. This query costs money and takes a long time for no ...
33
votes
7
answers
143k
views
Row number in BigQuery?
Is there any way to get a row number for each record in BigQuery? (From the specs, I haven't seen anything about it.) There is an NTH() function, but that applies to repeated fields.
There are some ...
33
votes
1
answer
29k
views
What does REPEATED field in Google Bigquery mean?
Please check my understanding of the REPEATED field in the following examples:
{
"title": "History of Alphabet",
"author": [
{
"name": "Larry"
}
]
}
This JSON ...
32
votes
5
answers
196k
views
BigQuery: SPLIT() returns only one value
I have a page URL column whose components are delimited by /. I tried to run the SPLIT() function in BigQuery, but it only gives the first value. I want all values in specific columns.
I don't ...
32
votes
4
answers
69k
views
BigQuery - remove unused column from schema
I accidentally added a wrong column to my BigQuery table schema.
Instead of reloading the complete table (millions of rows), I would like to know if the following is possible:
remove bad rows (rows ...
32
votes
2
answers
31k
views
How to Manage Google API Errors in Python
I'm currently doing a lot of stuff with BigQuery, and am using a lot of try... except.... It looks like just about every error I get back from BigQuery is an apiclient.errors.HttpError, but with ...
32
votes
1
answer
24k
views
Google BigQuery, I lost null row when using 'unnest' function
#StandardSQL
WITH tableA AS (
SELECT ["T001", "T002", "T003"] AS T_id, [1, 5] AS L_id
UNION ALL
SELECT ["T008", "T009"] AS T_id, NULL AS L_id
)
SELECT * FROM tableA, UNNEST(L_id) AS unnest
When I ...
32
votes
9
answers
123k
views
Error: Not found: Dataset my-project-name:domain_public was not found in location US
I need to query a dataset provided by a public project. I created my own project and added their dataset to my project. There is a table named: domain_public. When I make a query to this ...
32
votes
2
answers
35k
views
percentile functions with GROUPBY in BigQuery
In my CENSUS table, I'd like to group by State, and for each State get the median county population and the number of counties.
In psql, redshift, and snowflake, I can do this:
psql=> SELECT ...
32
votes
2
answers
152k
views
BIGQUERY SELECT list expression references column CHANNEL_ID which is neither grouped nor aggregated at [10:13]
I am facing this error:
BIGQUERY SELECT list expression references column CHANNEL_ID which is
neither grouped nor aggregated at [10:13]
I don't know what causes it. Can someone explain it to me?
...
31
votes
3
answers
36k
views
BigQuery: Aggregate multiple fields into array
I have some data where for each ID I want to aggregate two or more fields into an array, and I want them to match in terms of order.
So for example if I have the following data:
I want to turn it ...
31
votes
1
answer
9k
views
How to access “Saved Queries” programmatically?
In the BigQuery web UI, there is a “Saved Queries” section.
Is there a way to access (read/write) those programmatically?
Any API?
30
votes
4
answers
67k
views
Truncate a table in GBQ
I am trying to truncate an existing table in GBQ, but the command below fails when I run it. Is there a specific command or syntax to do that? I looked into the GBQ documentation but had no luck.
TRUNCATE ...