Class GoogleSheetsUtils
java.lang.Object
org.apache.drill.exec.store.googlesheets.utils.GoogleSheetsUtils
-
Nested Class Summary
Modifier and TypeClassDescriptionstatic enum
Represents the possible data types found in a GoogleSheets document -
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionstatic void
addTabToGoogleSheet
(com.google.api.services.sheets.v4.Sheets service, String sheetName, String tabName) Adds a new tab to an existing GoogleSheet document.static void
appendDataToGoogleSheet
(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) Accepts a list of data and writes this data to a GoogleSheet document.static com.google.api.client.auth.oauth2.Credential
authorize
(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) Creates an authorizedCredential
for use in GoogleSheets queries.static TupleMetadata
buildSchema
(Map<String, GoogleSheetsColumn> columnMap) Builds a Drill Schema from a Map of GoogleSheetsColumns.static String
columnToLetter
(int column) Converts a column index to A1 notation.getBatchData
(com.google.api.services.sheets.v4.Sheets service, String sheetID, List<String> ranges) This function is used to get data when projection is pushed down to Google Sheets.static Map<String,
GoogleSheetsColumn> getColumnMap
(List<List<Object>> sampleData, List<SchemaPath> projectedColumns, boolean allTextMode) getDataFromRange
(com.google.api.services.sheets.v4.Sheets service, String sheetID, String range) Returns a 2D table of Objects representing the given range in A1 notation.static com.google.api.services.drive.Drive
getDriveService
(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) Returns an authenticatedDrive
service.getFirstRows
(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName) This function will be used to build the schema for Drill.static List<GoogleSheetsColumnRange>
getProjectedRanges
(String sheetName, Map<String, GoogleSheetsColumn> columnMap) static com.google.api.services.sheets.v4.model.Sheet
getSheetFromTabList
(String tabName, List<com.google.api.services.sheets.v4.model.Sheet> tabList) Finds aSheet
from a list of tabs with a given title.static com.google.api.services.sheets.v4.Sheets
getSheetsService
(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) static List<com.google.api.services.sheets.v4.model.Sheet>
getTabList
(com.google.api.services.sheets.v4.Sheets service, String sheetID) Returns a list of the titles of the available spreadsheets within a given Google sheet.getTokenToNameMap
(com.google.api.services.drive.Drive driveService) In GoogleSheets, the file is uniquely identified by a non-human readable token.static GoogleSheetsUtils.DATA_TYPES
inferDataType
(String data) Infers the datatype of an unknown string.static boolean
Google Sheets tokens are strings of length 44 that contain upper and lower case letters, numbers and underscores.static boolean
isProjected
(List<SchemaPath> projectedColumns, String columnName) Returns true if the column is projected, false if not.static int
letterToColumnIndex
(String letter) Given a column reference in A1 notation, this function will return the column numeric index.static com.google.api.client.auth.oauth2.Credential
loadCredential
(String userId, com.google.api.client.googleapis.auth.oauth2.GoogleAuthorizationCodeFlow flow, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> credentialDataStore) static void
removeTabFromGoogleSheet
(com.google.api.services.sheets.v4.Sheets service, String fileToken, com.google.api.services.sheets.v4.model.Sheet deletedTab) Removes a sheet from an existing GoogleSheets document.static void
updateDataType
(String columnName, Map<String, GoogleSheetsUtils.DATA_TYPES> dataTypesMap, String value) static void
writeDataToGoogleSheet
(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) Accepts a list of data and writes this data to a GoogleSheet document.
-
Constructor Details
-
GoogleSheetsUtils
public GoogleSheetsUtils()
-
-
Method Details
-
authorize
public static com.google.api.client.auth.oauth2.Credential authorize(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) throws IOException, GeneralSecurityException Creates an authorizedCredential
for use in GoogleSheets queries.- Parameters:
config
- TheGoogleSheetsStoragePluginConfig
to be authorizeddataStore
- ADrillDataStore
containing the user's tokensqueryUser
- The current query user's ID. This should be set to anonymous if user translation is disabled.- Returns:
- A validated
Credential
object. - Throws:
IOException
- If anything goes wrongGeneralSecurityException
- If the credentials are invalid
-
loadCredential
public static com.google.api.client.auth.oauth2.Credential loadCredential(String userId, com.google.api.client.googleapis.auth.oauth2.GoogleAuthorizationCodeFlow flow, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> credentialDataStore) -
getSheetsService
public static com.google.api.services.sheets.v4.Sheets getSheetsService(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) throws IOException, GeneralSecurityException - Throws:
IOException
GeneralSecurityException
-
getDriveService
public static com.google.api.services.drive.Drive getDriveService(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) throws IOException, GeneralSecurityException Returns an authenticatedDrive
service.- Parameters:
config
- TheGoogleSheetsStoragePluginConfig
for the plugindataStore
- ADrillDataStore
for the stored credentialsqueryUser
- AString
containing the current query user.- Returns:
- An authenticated
Drive
service. - Throws:
IOException
- If anything goes wrong throw an IOExceptionGeneralSecurityException
- If the creds are invalid or
-
getTokenToNameMap
public static Map<String,String> getTokenToNameMap(com.google.api.services.drive.Drive driveService) throws IOException In GoogleSheets, the file is uniquely identified by a non-human readable token. The Sheets SDK does not provide a means to map tokens to file names. To do this, we need to use the Google Drive SDK. Google Drive's concept of folders is more similar to that of S3 which doesn't really have folders or directories. So the user is not able to include any file paths in the query string. More importantly however, is that Google Drive allows duplicate file names, even within the same directory. Thus, it is entirely possible to have an entire directory of files with the same name. The tokens for these files will be different, but the human-readable names can be the same. This creates an obvious problem for Drill as we need unique file names to include in a query.- Parameters:
driveService
- An authenticatedDrive
service.- Returns:
- A
HashMap
containing the tokens as keys and the file names as values. - Throws:
IOException
- If anything goes wrong, throw an IOException.
-
isProbableFileToken
Google Sheets tokens are strings of length 44 that contain upper and lower case letters, numbers and underscores. This function will attempt to identify file tokens.Given that Google's spec for file IDs is not officially published, and can change at any time, we will keep the validation as light as possible to prevent future issues, in the event Google changes their file Id structure.
- Parameters:
id
- AString
containing an unknown identifier- Returns:
- True if the string is a file probable file token, false if not.
-
getTabList
public static List<com.google.api.services.sheets.v4.model.Sheet> getTabList(com.google.api.services.sheets.v4.Sheets service, String sheetID) throws IOException Returns a list of the titles of the available spreadsheets within a given Google sheet.- Parameters:
service
- The Google Sheets servicesheetID
- The sheetID for the Google sheet. This can be obtained from the URL of your Google sheet- Returns:
- A list of spreadsheet names within a given Google Sheet
- Throws:
IOException
- If the Google sheet is unreachable or invalid.
-
columnToLetter
Converts a column index to A1 notation. Google sheets has a limitation of approx 18k columns, but that is not enforced here. The column index must be greater than zero or the function will return null. References code found here: Stack Overflow Article- Parameters:
column
- The column index for the desired column. Must be greater than zero- Returns:
- The A1 representation of the column index.
-
letterToColumnIndex
Given a column reference in A1 notation, this function will return the column numeric index. GoogleSheets has a limit of approx 18k columns, but that is not enforced here. References code found here: Stack Overflow Article- Parameters:
letter
- The desired column in A1 notation- Returns:
- The index of the supplied column
-
getFirstRows
public static List<List<Object>> getFirstRows(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName) throws IOException This function will be used to build the schema for Drill. As Google sheets does- Parameters:
service
- An authenticated Google Sheets ServicesheetID
- The Sheet ID for the Google Sheet (Can be found in the Sheet URL)tabName
- The tab name of the actual spreadsheet you want to query- Returns:
- A nested list of the first five rows of the dataset.
- Throws:
IOException
- If the request fails, throw an IOException
-
getDataFromRange
public static List<List<Object>> getDataFromRange(com.google.api.services.sheets.v4.Sheets service, String sheetID, String range) throws IOException Returns a 2D table of Objects representing the given range in A1 notation. Note that this function cannot be used with multiple ranges. If you are trying to retrieve multiple groups of columns, you must use the getBatchData function.- Parameters:
service
- The Authenticated GoogleSheets servicesheetID
- The GoogleSheet ID. This can be found in the Sheet URLrange
- The range in A1 notation.- Returns:
- A 2D table of Objects representing the given range.
- Throws:
IOException
- If the request fails, throw an IOException.
-
getSheetFromTabList
public static com.google.api.services.sheets.v4.model.Sheet getSheetFromTabList(String tabName, List<com.google.api.services.sheets.v4.model.Sheet> tabList) Finds aSheet
from a list of tabs with a given title. If the sheet is not present, the function will throw a User Exception.- Parameters:
tabName
- The name of the desired sheet.tabList
- AList
ofSheet
objects- Returns:
- The desired Sheet.
-
getBatchData
public static List<List<Object>> getBatchData(com.google.api.services.sheets.v4.Sheets service, String sheetID, List<String> ranges) throws IOException This function is used to get data when projection is pushed down to Google Sheets.- Parameters:
service
- The Authenticated GoogleSheets servicesheetID
- The GoogleSheet ID. This can be found in the Sheet URLranges
- The list of ranges- Throws:
IOException
- If anything goes wrong, IOException will be thrown
-
getColumnMap
public static Map<String,GoogleSheetsColumn> getColumnMap(List<List<Object>> sampleData, List<SchemaPath> projectedColumns, boolean allTextMode) - Parameters:
sampleData
- This represents a sample of the first few rows of data which will be used to build the schema.projectedColumns
- A list of projected columnsallTextMode
- If true, the columns will all be of the VARCHAR type- Returns:
- A map of the column name and
GoogleSheetsColumn
column for every projected column.
-
getProjectedRanges
public static List<GoogleSheetsColumnRange> getProjectedRanges(String sheetName, Map<String, GoogleSheetsColumn> columnMap) -
isProjected
Returns true if the column is projected, false if not.- Parameters:
projectedColumns
- A list of projected columns AKA the haystack.columnName
- The column name AKA the needle- Returns:
- True if the needle is in the haystack, false if not.
-
buildSchema
Builds a Drill Schema from a Map of GoogleSheetsColumns.- Parameters:
columnMap
- A map ofGoogleSheetsColumn
containing the schema info.- Returns:
- A populated
TupleMetadata
schema
-
inferDataType
Infers the datatype of an unknown string.- Parameters:
data
- An input String of unknown type.- Returns:
- The
GoogleSheetsUtils.DATA_TYPES
of the unknown string.
-
updateDataType
public static void updateDataType(String columnName, Map<String, GoogleSheetsUtils.DATA_TYPES> dataTypesMap, String value) -
addTabToGoogleSheet
public static void addTabToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetName, String tabName) throws IOException Adds a new tab to an existing GoogleSheet document.- Parameters:
service
- An authenticated GoogleSheet servicesheetName
- The GoogleSheet name of the documenttabName
- The name of the tab you wish to add to the GoogleSheet document- Throws:
IOException
- Throws an IOException if anything goes wrong.
-
removeTabFromGoogleSheet
public static void removeTabFromGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String fileToken, com.google.api.services.sheets.v4.model.Sheet deletedTab) throws IOException Removes a sheet from an existing GoogleSheets document. This method should only be used if the GoogleSheets document has more than one tab.- Parameters:
service
- An authenticated GoogleSheetSheets
fileToken
- The File token of the GoogleSheet containing the sheet to be deleteddeletedTab
- ASheet
which will be removed- Throws:
IOException
- If anything goes wrong.
-
writeDataToGoogleSheet
public static void writeDataToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) throws IOException Accepts a list of data and writes this data to a GoogleSheet document.- Parameters:
service
- An authenticated GoogleSheet servicesheetID
- The SheetID. This can be obtained from the URL of the GoogleSheet DocumenttabName
- The tab name within the aforementioned GoogleSheetdata
- A list of rows of the data to be inserted.- Throws:
IOException
- If anything goes wrong, throw an IO exception
-
appendDataToGoogleSheet
public static void appendDataToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) throws IOException Accepts a list of data and writes this data to a GoogleSheet document.- Parameters:
service
- An authenticated GoogleSheet servicesheetID
- The SheetID. This can be obtained from the URL of the GoogleSheet DocumenttabName
- The tab name within the aforementioned GoogleSheetdata
- A list of rows of the data to be inserted.- Throws:
IOException
- If anything goes wrong, throw an IO exception
-