Package org.apache.drill.exec.store.easy.json.extended

Provides parsing for Mongo extended types, which are generally of the form { "$type": value }. Supports both V1 and V2 names, and both the Canonical and Relaxed formats.

Not all types are supported, since some appear to be internal to Mongo. Supported types:

Unsupported types:

The unsupported types appear mostly in commands and queries rather than in data, and they do not correspond to a Drill type. If they do appear in data, they are translated to a Drill map.

Drill defines a few "extended extended" types:

Drill extends the extended types to allow null values in the usual way. Drill accepts normal "un-extended" JSON in the same file, but doing so can lead to ambiguities (see below).

Once Drill defines a field as an extended type, parsing rules are tighter than for normal "non-extended" types. For example, an extended double will not convert from a Boolean or float value.

Provided Schema

If used with a provided schema, then:

Ambiguities

Extended JSON is subject to the same ambiguities as normal JSON. If Drill sees a field in relaxed form before it sees the same field in extended form, Drill will use its normal type inference rules. Thus, if the first occurrence of a field is a: "30", Drill will infer the type as string, even if a later occurrence is a: { "$numberInt": 30 }. To avoid ambiguities, either use only the canonical format, or use a provided schema.
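The "first occurrence wins" behavior described above can be illustrated with a minimal sketch. The class, enum, and method names here are hypothetical and are not part of Drill; the sketch only models the idea that a column's type is fixed the first time the column is seen.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; these are not Drill's actual classes.
final class FirstSeenTypeInference {

    // Two stand-in column types, enough to show the ambiguity.
    enum ColType { VARCHAR, INT }

    private final Map<String, ColType> schema = new LinkedHashMap<>();

    // Records the type of a column the first time it appears and returns
    // the type in effect. Later occurrences do not change the recorded type.
    ColType observe(String column, ColType seenAs) {
        return schema.computeIfAbsent(column, k -> seenAs);
    }

    public static void main(String[] args) {
        FirstSeenTypeInference infer = new FirstSeenTypeInference();
        infer.observe("a", ColType.VARCHAR);          // first record: a: "30"
        ColType t = infer.observe("a", ColType.INT);  // later record: extended int form
        System.out.println(t);                        // prints VARCHAR
    }
}
```

In this model, supplying a schema up front would amount to populating the map before any data is read, which is why a provided schema removes the ambiguity.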

Implementation

Extended types are disabled by default and must be enabled using the store.json.extended_types system/session option (ExecConstants.JSON_EXTENDED_TYPES_KEY).

Extended types are implemented via a field factory. The field factory builds the structure needed each time the JSON structure parser sees a new field. For extended types, the field factory looks ahead to detect an extended type, specifically for the pattern { "$type":. If the pattern is found, and the name is one of the supported type names, then the factory creates a parser to accept the enhanced type in either the canonical or relaxed forms.
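The look-ahead step can be sketched roughly as follows. This is a simplified illustration, not Drill's implementation: the Token and Kind types are hypothetical stand-ins for a real JSON tokenizer, and the set of type names is an assumed, illustrative subset of the Mongo names rather than the list Drill actually supports.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; these are not Drill's actual classes.
final class ExtendedTypeLookahead {

    // Simplified token kinds; a real JSON tokenizer distinguishes more.
    enum Kind { START_OBJECT, END_OBJECT, FIELD_NAME, VALUE }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Assumed, illustrative subset of Mongo extended type names.
    static final Set<String> SUPPORTED =
        Set.of("$numberInt", "$numberLong", "$numberDouble", "$date");

    // Peeks at the first two tokens of a field's value and reports whether
    // they match the pattern { "$type": with a supported type name.
    static boolean isExtendedType(List<Token> valueTokens) {
        if (valueTokens.size() < 2) {
            return false;
        }
        Token first = valueTokens.get(0);
        Token second = valueTokens.get(1);
        return first.kind == Kind.START_OBJECT
            && second.kind == Kind.FIELD_NAME
            && SUPPORTED.contains(second.text);
    }

    public static void main(String[] args) {
        List<Token> extended = List.of(
            new Token(Kind.START_OBJECT, "{"),
            new Token(Kind.FIELD_NAME, "$numberInt"),
            new Token(Kind.VALUE, "30"),
            new Token(Kind.END_OBJECT, "}"));
        List<Token> plainMap = List.of(
            new Token(Kind.START_OBJECT, "{"),
            new Token(Kind.FIELD_NAME, "b"),
            new Token(Kind.VALUE, "30"),
            new Token(Kind.END_OBJECT, "}"));
        System.out.println(isExtendedType(extended)); // prints true
        System.out.println(isExtendedType(plainMap)); // prints false
    }
}
```

An object whose first field name is not a recognized type name fails the check, so in this sketch it would be handled as an ordinary map.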

Each field is represented by a Mongo-specific parser along with an associated value listener. The implementation does not reify the object structure; that structure is consumed by the field parser itself. The value listener receives value tokens as if the data were in relaxed format.
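A minimal sketch of that division of labor, under stated assumptions: the parser and listener names below are hypothetical, tokens are modeled as plain strings with quotes already stripped, and only an integer type is shown. The parser consumes the wrapper object itself, so the listener sees the same call whether the input was in the wrapped form or the bare form.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these are not Drill's actual classes.
final class ExtendedIntParserSketch {

    // The listener only ever sees a plain value, never the wrapper object.
    interface ValueListener {
        void onInt(int value);
    }

    // Accepts either a bare value (one token) or the wrapped form
    // "{", "$numberInt", value, "}" and forwards the value to the listener.
    static void parse(List<String> tokens, ValueListener listener) {
        if (tokens.size() == 1) {
            listener.onInt(Integer.parseInt(tokens.get(0)));
            return;
        }
        boolean wrapped = tokens.size() == 4
            && "{".equals(tokens.get(0))
            && "$numberInt".equals(tokens.get(1))
            && "}".equals(tokens.get(3));
        if (!wrapped) {
            throw new IllegalArgumentException("Not an extended int: " + tokens);
        }
        // The wrapper is consumed here; no intermediate object is built.
        listener.onInt(Integer.parseInt(tokens.get(2)));
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        parse(List.of("{", "$numberInt", "30", "}"), seen::add);
        parse(List.of("30"), seen::add);
        System.out.println(seen); // prints [30, 30]
    }
}
```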

See Also:
MapVectorOutput for an older implementation

Copyright © 1970 The Apache Software Foundation. All rights reserved.