Class GoogleSheetsUtils
java.lang.Object
org.apache.drill.exec.store.googlesheets.utils.GoogleSheetsUtils
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic enumRepresents the possible data types found in a GoogleSheets document -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionstatic voidaddTabToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetName, String tabName) Adds a new tab to an existing GoogleSheet document.static voidappendDataToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) Accepts a list of data and writes this data to a GoogleSheet document.static com.google.api.client.auth.oauth2.Credentialauthorize(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) Creates an authorizedCredentialfor use in GoogleSheets queries.static TupleMetadatabuildSchema(Map<String, GoogleSheetsColumn> columnMap) Builds a Drill Schema from a Map of GoogleSheetsColumns.static StringcolumnToLetter(int column) Converts a column index to A1 notation.getBatchData(com.google.api.services.sheets.v4.Sheets service, String sheetID, List<String> ranges) This function is used to get data when projection is pushed down to Google Sheets.static Map<String, GoogleSheetsColumn> getColumnMap(List<List<Object>> sampleData, List<SchemaPath> projectedColumns, boolean allTextMode) getDataFromRange(com.google.api.services.sheets.v4.Sheets service, String sheetID, String range) Returns a 2D table of Objects representing the given range in A1 notation.static com.google.api.services.drive.DrivegetDriveService(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) Returns an authenticatedDriveservice.getFirstRows(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName) This function will be used to build the schema for Drill.static List<GoogleSheetsColumnRange> getProjectedRanges(String sheetName, Map<String, GoogleSheetsColumn> columnMap) static com.google.api.services.sheets.v4.model.SheetgetSheetFromTabList(String tabName, List<com.google.api.services.sheets.v4.model.Sheet> tabList) Finds aSheetfrom a list of tabs with a given title.static com.google.api.services.sheets.v4.SheetsgetSheetsService(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) static List<com.google.api.services.sheets.v4.model.Sheet> getTabList(com.google.api.services.sheets.v4.Sheets service, String sheetID) Returns a list of the titles of the available spreadsheets within a given Google sheet.getTokenToNameMap(com.google.api.services.drive.Drive driveService) In GoogleSheets, the file is uniquely identified by a non-human readable token.static GoogleSheetsUtils.DATA_TYPESinferDataType(String data) Infers the datatype of an unknown string.static booleanGoogle Sheets tokens are strings of length 44 that contain upper and lower case letters, numbers and underscores.static booleanisProjected(List<SchemaPath> projectedColumns, String columnName) Returns true if the column is projected, false if not.static intletterToColumnIndex(String letter) Given a column reference in A1 notation, this function will return the column numeric index.static com.google.api.client.auth.oauth2.CredentialloadCredential(String userId, com.google.api.client.googleapis.auth.oauth2.GoogleAuthorizationCodeFlow flow, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> credentialDataStore) static voidremoveTabFromGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String fileToken, com.google.api.services.sheets.v4.model.Sheet deletedTab) Removes a sheet from an existing GoogleSheets document.static voidupdateDataType(String columnName, Map<String, GoogleSheetsUtils.DATA_TYPES> dataTypesMap, String value) static voidwriteDataToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) Accepts a list of data and writes this data to a GoogleSheet document.
-
Constructor Details
-
GoogleSheetsUtils
public GoogleSheetsUtils()
-
-
Method Details
-
authorize
public static com.google.api.client.auth.oauth2.Credential authorize(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) throws IOException, GeneralSecurityException Creates an authorizedCredentialfor use in GoogleSheets queries.- Parameters:
config- TheGoogleSheetsStoragePluginConfigto be authorizeddataStore- ADrillDataStorecontaining the user's tokensqueryUser- The current query user's ID. This should be set to anonymous if user translation is disabled.- Returns:
- A validated
Credentialobject. - Throws:
IOException- If anything goes wrongGeneralSecurityException- If the credentials are invalid
-
loadCredential
public static com.google.api.client.auth.oauth2.Credential loadCredential(String userId, com.google.api.client.googleapis.auth.oauth2.GoogleAuthorizationCodeFlow flow, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> credentialDataStore) -
getSheetsService
public static com.google.api.services.sheets.v4.Sheets getSheetsService(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) throws IOException, GeneralSecurityException - Throws:
IOExceptionGeneralSecurityException
-
getDriveService
public static com.google.api.services.drive.Drive getDriveService(GoogleSheetsStoragePluginConfig config, com.google.api.client.util.store.DataStore<com.google.api.client.auth.oauth2.StoredCredential> dataStore, String queryUser) throws IOException, GeneralSecurityException Returns an authenticatedDriveservice.- Parameters:
config- TheGoogleSheetsStoragePluginConfigfor the plugindataStore- ADrillDataStorefor the stored credentialsqueryUser- AStringcontaining the current query user.- Returns:
- An authenticated
Driveservice. - Throws:
IOException- If anything goes wrong throw an IOExceptionGeneralSecurityException- If the creds are invalid or
-
getTokenToNameMap
public static Map<String,String> getTokenToNameMap(com.google.api.services.drive.Drive driveService) throws IOException In GoogleSheets, the file is uniquely identified by a non-human readable token. The Sheets SDK does not provide a means to map tokens to file names. To do this, we need to use the Google Drive SDK. Google Drive's concept of folders is more similar to that of S3 which doesn't really have folders or directories. So the user is not able to include any file paths in the query string. More importantly however, is that Google Drive allows duplicate file names, even within the same directory. Thus, it is entirely possible to have an entire directory of files with the same name. The tokens for these files will be different, but the human-readable names can be the same. This creates an obvious problem for Drill as we need unique file names to include in a query.- Parameters:
driveService- An authenticatedDriveservice.- Returns:
- A
HashMapcontaining the tokens as keys and the file names as values. - Throws:
IOException- If anything goes wrong, throw an IOException.
-
isProbableFileToken
Google Sheets tokens are strings of length 44 that contain upper and lower case letters, numbers and underscores. This function will attempt to identify file tokens.Given that Google's spec for file IDs is not officially published, and can change at any time, we will keep the validation as light as possible to prevent future issues, in the event Google changes their file Id structure.
- Parameters:
id- AStringcontaining an unknown identifier- Returns:
- True if the string is a file probable file token, false if not.
-
getTabList
public static List<com.google.api.services.sheets.v4.model.Sheet> getTabList(com.google.api.services.sheets.v4.Sheets service, String sheetID) throws IOException Returns a list of the titles of the available spreadsheets within a given Google sheet.- Parameters:
service- The Google Sheets servicesheetID- The sheetID for the Google sheet. This can be obtained from the URL of your Google sheet- Returns:
- A list of spreadsheet names within a given Google Sheet
- Throws:
IOException- If the Google sheet is unreachable or invalid.
-
columnToLetter
Converts a column index to A1 notation. Google sheets has a limitation of approx 18k columns, but that is not enforced here. The column index must be greater than zero or the function will return null. References code found here: Stack Overflow Article- Parameters:
column- The column index for the desired column. Must be greater than zero- Returns:
- The A1 representation of the column index.
-
letterToColumnIndex
Given a column reference in A1 notation, this function will return the column numeric index. GoogleSheets has a limit of approx 18k columns, but that is not enforced here. References code found here: Stack Overflow Article- Parameters:
letter- The desired column in A1 notation- Returns:
- The index of the supplied column
-
getFirstRows
public static List<List<Object>> getFirstRows(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName) throws IOException This function will be used to build the schema for Drill. As Google sheets does- Parameters:
service- An authenticated Google Sheets ServicesheetID- The Sheet ID for the Google Sheet (Can be found in the Sheet URL)tabName- The tab name of the actual spreadsheet you want to query- Returns:
- A nested list of the first five rows of the dataset.
- Throws:
IOException- If the request fails, throw an IOException
-
getDataFromRange
public static List<List<Object>> getDataFromRange(com.google.api.services.sheets.v4.Sheets service, String sheetID, String range) throws IOException Returns a 2D table of Objects representing the given range in A1 notation. Note that this function cannot be used with multiple ranges. If you are trying to retrieve multiple groups of columns, you must use the getBatchData function.- Parameters:
service- The Authenticated GoogleSheets servicesheetID- The GoogleSheet ID. This can be found in the Sheet URLrange- The range in A1 notation.- Returns:
- A 2D table of Objects representing the given range.
- Throws:
IOException- If the request fails, throw an IOException.
-
getSheetFromTabList
public static com.google.api.services.sheets.v4.model.Sheet getSheetFromTabList(String tabName, List<com.google.api.services.sheets.v4.model.Sheet> tabList) Finds aSheetfrom a list of tabs with a given title. If the sheet is not present, the function will throw a User Exception.- Parameters:
tabName- The name of the desired sheet.tabList- AListofSheetobjects- Returns:
- The desired Sheet.
-
getBatchData
public static List<List<Object>> getBatchData(com.google.api.services.sheets.v4.Sheets service, String sheetID, List<String> ranges) throws IOException This function is used to get data when projection is pushed down to Google Sheets.- Parameters:
service- The Authenticated GoogleSheets servicesheetID- The GoogleSheet ID. This can be found in the Sheet URLranges- The list of ranges- Throws:
IOException- If anything goes wrong, IOException will be thrown
-
getColumnMap
public static Map<String,GoogleSheetsColumn> getColumnMap(List<List<Object>> sampleData, List<SchemaPath> projectedColumns, boolean allTextMode) - Parameters:
sampleData- This represents a sample of the first few rows of data which will be used to build the schema.projectedColumns- A list of projected columnsallTextMode- If true, the columns will all be of the VARCHAR type- Returns:
- A map of the column name and
GoogleSheetsColumncolumn for every projected column.
-
getProjectedRanges
public static List<GoogleSheetsColumnRange> getProjectedRanges(String sheetName, Map<String, GoogleSheetsColumn> columnMap) -
isProjected
Returns true if the column is projected, false if not.- Parameters:
projectedColumns- A list of projected columns AKA the haystack.columnName- The column name AKA the needle- Returns:
- True if the needle is in the haystack, false if not.
-
buildSchema
Builds a Drill Schema from a Map of GoogleSheetsColumns.- Parameters:
columnMap- A map ofGoogleSheetsColumncontaining the schema info.- Returns:
- A populated
TupleMetadataschema
-
inferDataType
Infers the datatype of an unknown string.- Parameters:
data- An input String of unknown type.- Returns:
- The
GoogleSheetsUtils.DATA_TYPESof the unknown string.
-
updateDataType
public static void updateDataType(String columnName, Map<String, GoogleSheetsUtils.DATA_TYPES> dataTypesMap, String value) -
addTabToGoogleSheet
public static void addTabToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetName, String tabName) throws IOException Adds a new tab to an existing GoogleSheet document.- Parameters:
service- An authenticated GoogleSheet servicesheetName- The GoogleSheet name of the documenttabName- The name of the tab you wish to add to the GoogleSheet document- Throws:
IOException- Throws an IOException if anything goes wrong.
-
removeTabFromGoogleSheet
public static void removeTabFromGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String fileToken, com.google.api.services.sheets.v4.model.Sheet deletedTab) throws IOException Removes a sheet from an existing GoogleSheets document. This method should only be used if the GoogleSheets document has more than one tab.- Parameters:
service- An authenticated GoogleSheetSheetsfileToken- The File token of the GoogleSheet containing the sheet to be deleteddeletedTab- ASheetwhich will be removed- Throws:
IOException- If anything goes wrong.
-
writeDataToGoogleSheet
public static void writeDataToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) throws IOException Accepts a list of data and writes this data to a GoogleSheet document.- Parameters:
service- An authenticated GoogleSheet servicesheetID- The SheetID. This can be obtained from the URL of the GoogleSheet DocumenttabName- The tab name within the aforementioned GoogleSheetdata- A list of rows of the data to be inserted.- Throws:
IOException- If anything goes wrong, throw an IO exception
-
appendDataToGoogleSheet
public static void appendDataToGoogleSheet(com.google.api.services.sheets.v4.Sheets service, String sheetID, String tabName, List<List<Object>> data) throws IOException Accepts a list of data and writes this data to a GoogleSheet document.- Parameters:
service- An authenticated GoogleSheet servicesheetID- The SheetID. This can be obtained from the URL of the GoogleSheet DocumenttabName- The tab name within the aforementioned GoogleSheetdata- A list of rows of the data to be inserted.- Throws:
IOException- If anything goes wrong, throw an IO exception
-