Build #339

Build #339 was successful

Changes by Edwin Elia <edwinelia@google.com>

Code commits

Wrangler Transform

  • Edwin Elia <edwinelia@google.com>

    Edwin Elia <edwinelia@google.com> bd9af004bdae27eb9d0d7e43b41456fe2dd6e7f7

    Merge pull request #415 from data-integrations/release/4.2
    Release/4.2

  • albertshau <ashau@google.com>

    albertshau <ashau@google.com> d2ac99dde4657444fc163dc0121c18622a5129a9

    Merge pull request #410 from data-integrations/bugfix_release/CDAP-16724-fix-bad-sampling
    CDAP-16724 fix bad sampling in GCS handler

  • albertshau <ashau@google.com>

    albertshau <ashau@google.com> 47810cb829c8a0a31191b5100d06aa7a82becab7

    CDAP-16724 fix bad sampling in GCS handler
    Fixed the GCS handler to create less duplicate data. Instead of
    copying data multiple times, try to go through the file contents
    once and just maintain a single list of Rows.

    Also limited the number of rows in the sample to 5000. This is
    because the Row class uses a lot of memory, so the handler was
    running out of memory reading just 10 MB of an input file, where
    that 10 MB contained a million rows.

    Also fixed a bug in Row that made one of the constructors useless
    and changed another constructor to set the initial array size
    to reduce memory usage.
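
    The single-pass, capped sampling described above can be sketched
    roughly as follows. This is an illustration only, not the actual
    GCSHandler code: the class name SamplingSketch, the method sample,
    and the constant MAX_SAMPLE_ROWS are hypothetical, and plain
    Strings stand in for Wrangler's Row objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SamplingSketch {
  // Row cap taken from the commit message; the constant name is hypothetical.
  static final int MAX_SAMPLE_ROWS = 5000;

  /**
   * Walks the input once and keeps a single list of at most maxRows entries.
   * Strings stand in for Row objects (an assumption made for illustration).
   */
  static List<String> sample(Iterator<String> lines, int maxRows) {
    // Pre-size the list modestly so it does not over-allocate for small inputs.
    List<String> rows = new ArrayList<>(Math.max(0, Math.min(maxRows, 1024)));
    // Stop as soon as the cap is reached, even if more input remains.
    while (rows.size() < maxRows && lines.hasNext()) {
      rows.add(lines.next());
    }
    return rows;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("a,1", "b,2", "c,3");
    // With a cap of 2, only the first two lines are kept.
    System.out.println(sample(input.iterator(), 2));
  }
}
```

    Stopping on a row count as well as a byte count bounds memory
    for inputs with many short rows, which is the case the commit
    message describes.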

    • wrangler-api/src/main/java/io/cdap/wrangler/api/Row.java (version 47810cb829c8a0a31191b5100d06aa7a82becab7)
    • wrangler-service/src/main/java/io/cdap/wrangler/service/gcs/GCSHandler.java (version 47810cb829c8a0a31191b5100d06aa7a82becab7)
  • albertshau <ashau@google.com>

    albertshau <ashau@google.com> a46291d30b42833048c1c7e1b8c1a5a418720d4c

    Merge pull request #408 from data-integrations/bugfix_release/CDAP-16758-fix-schema-tests
    CDAP-16758 fix schema related unit tests

  • albertshau <ashau@google.com>

    albertshau <ashau@google.com> c0ea896b4b590be5d1b111e57bf59f447e15606e

    CDAP-16758 fix schema related unit tests
    Use the correct schema name to fix unit tests.

    • wrangler-core/src/test/java/io/cdap/wrangler/utils/Json2SchemaTest.java (version c0ea896b4b590be5d1b111e57bf59f447e15606e)
  • Venudhar Ravishankar <venu.ravishankar@gmail.com>

    Venudhar Ravishankar <venu.ravishankar@gmail.com> 80a485212df3aaeff50817ac73baeaa21245dfc3

    Merge pull request #398 from data-integrations/bugfix_release/CDAP-16633-drive-and-bq-scope-for-bq-requests
    [CDAP-16633] Added option to generate scoped GoogleCredentials with BQ and Drive scope for all BQ requests

  • Venudhar Ravishankar

    Venudhar Ravishankar 774e53275fcc2921b62ff7647aa8fd48a85d67d0

    [CDAP-16633] Added option to generate scoped GoogleCredentials with BQ and Drive scope for all BQ requests

    • wrangler-service/src/main/java/io/cdap/wrangler/service/gcp/GCPUtils.java (version 774e53275fcc2921b62ff7647aa8fd48a85d67d0)