albertshau <ashau@google.com>: Author Summary

Builds triggered by albertshau <ashau@google.com>

Builds triggered by an author are those builds that contain changes committed by the author.
937
284 (30%)
653 (70%)

Breakages and fixes

Broken means the build has failed but the previous build was successful.
Fixed means the build was successful but the previous build had failed.
Broken: 83 (9% of all builds triggered)
Fixed: 72 (8% of all builds triggered)
Difference (fixed minus broken): -11
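
The definitions of "broken" and "fixed" above can be sketched as a count of state transitions over a chronological list of build results. This is only an illustration, not how the build server computes its figures: the function name and the boolean encoding are assumptions, and the sketch ignores per-plan histories and per-author attribution.

```python
from typing import List, Tuple


def count_breakages_and_fixes(results: List[bool]) -> Tuple[int, int]:
    """Count broken and fixed builds in one plan's history.

    results is chronological; True means successful, False means failed
    (hypothetical encoding chosen for this sketch).
    """
    broken = 0
    fixed = 0
    for previous, current in zip(results, results[1:]):
        if previous and not current:
            broken += 1  # failed, but the previous build was successful
        elif not previous and current:
            fixed += 1  # successful, but the previous build had failed
    return broken, fixed


history = [True, False, False, True, False]
broken, fixed = count_breakages_and_fixes(history)
print(broken, fixed, fixed - broken)  # prints: 2 1 -1
```

A failure that follows another failure counts as neither broken nor fixed, which is why the two totals can differ. The -11 reported above matches 72 fixes minus 83 breakages.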
Build Completed Code commits Tests
IT › UPD2 › #296 1 day ago
log warnings in one message to ensure they are always together
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
Testless build
CDAP › UDUT › #738 2 days ago
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
log warnings in one message to ensure they are always together
Testless build
CDAP › URUT › #747 2 days ago
log warnings in one message to ensure they are always together
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
Testless build
CDAP › DUT › #2748 2 days ago
log warnings in one message to ensure they are always together
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
1 of 1881 failed
CDAP › RUT › #947 2 days ago
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
log warnings in one message to ensure they are always together
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
2624 passed
CDAP › BPP › #966 2 days ago
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
log warnings in one message to ensure they are always together
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
Testless build
CDAP › DRC › #4687 2 days ago
log warnings in one message to ensure they are always together
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
Testless build
HYP › WT › #322 3 days ago
Merge pull request #357 from data-integrations/bump-version-4-1-0
bump to version 4.1.0
bump to version 4.1.0
Merge pull request #359 from data-integrations/bugfix_release/CDAP-15993-fix-parse-csv-col-whitespace
CDAP-15993 cleanse parse-as-csv column names
CDAP-15993 cleanse parse-as-csv column names
Replaces whitespace with underscores in column names generated
from headers by parse-as-csv. This is because whitespace makes
later directives behave very strangely without raising errors.
1 of 296 failed
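
The column-name cleansing described in CDAP-15993 above can be sketched as follows. This is not the Wrangler implementation: the function name is hypothetical, and replacing each whitespace character with one underscore (rather than collapsing runs) is an assumption, since the commit message does not say.

```python
import re


def cleanse_column_name(header: str) -> str:
    """Replace whitespace in a header-derived column name with underscores.

    Assumption: every whitespace character maps to exactly one underscore.
    """
    return re.sub(r"\s", "_", header)


headers = ["first name", "zip code", "id"]
print([cleanse_column_name(h) for h in headers])
# prints: ['first_name', 'zip_code', 'id']
```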
IT › UPD2 › #293 4 days ago
CDAP-16027 fix NPE when stage artifact is not configured
Merge pull request #11690 from cdapio/bugfix_release/CDAP-16027-fix-NPE-when-artifact-missing
CDAP-16027 fix NPE when stage artifact is not configured
Merge pull request #11683 from cdapio/bugfix_release/CDAP-15973-log-dataproc-failures
CDAP-15973 log cluster creation failures in dataproc provisioner
CDAP-15973 log cluster creation failures in dataproc provisioner
Enhanced the provisioner to log the creation failure error message
if it sees a failed cluster. Without this, the user will not know
why the cluster failed to create.
CDAP-15973 log a warning if multiple create operations found
Testless build
CDAP › URUT › #746 4 days ago
Merge pull request #11690 from cdapio/bugfix_release/CDAP-16027-fix-NPE-when-artifact-missing
CDAP-16027 fix NPE when stage artifact is not configured
CDAP-15973 log cluster creation failures in dataproc provisioner
Enhanced the provisioner to log the creation failure error message
if it sees a failed cluster. Without this, the user will not know
why the cluster failed to create.
CDAP-15973 log a warning if multiple create operations found
CDAP-16027 fix NPE when stage artifact is not configured
Merge pull request #11683 from cdapio/bugfix_release/CDAP-15973-log-dataproc-failures
CDAP-15973 log cluster creation failures in dataproc provisioner
Testless build
Build Completed Code commits Tests
CDAP › DUT › #2748 2 days ago
log warnings in one message to ensure they are always together
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add a check, when a delete is requested, for whether the cluster
is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
1 of 1881 failed
IT › UPD2 › #286 1 week ago
remove validator test
the validator plugin was removed, so removing the test as well
Merge pull request #1034 from cdapio/remove-validator-test
remove validator test
Testless build
CDAP › RUT › #945 1 week ago
Merge pull request #11674 from cdapio/bugfix_release/CDAP-15971-reorder-dataproc-properties
CDAP-15971 reorder CMEK property to be on bottom
Merge pull request #11672 from cdapio/bugfix_release/CDAP-15501-set-blockinterval
CDAP-15501 set block interval automatically
CDAP-15501 set block interval automatically
For most streaming pipelines, the default block interval of
200ms generates too many partitions. This is especially true of
file-based sinks, which will write out a file for each part.

If the block interval is not explicitly set by the user, set it
to 20% of the batch interval to generate a smaller number of
partitions.
CDAP-15971 reorder CMEK property to be on bottom
Move the CMEK dataproc property to the bottom of the section and
change the bucket property to be of medium size. This fixes the
weird styling where everything is medium except for CMEK, which is
in the middle of the section.
1 of 1873 failed
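
The block-interval rule from CDAP-15501 above (use the user's value if set, otherwise 20% of the batch interval) can be sketched as below. The function and parameter names are hypothetical, and the 1 ms floor is an assumption added so the sketch never returns zero; the commit message does not mention one.

```python
from typing import Optional


def effective_block_interval_ms(
    batch_interval_ms: int,
    user_block_interval_ms: Optional[int] = None,
) -> int:
    """Pick the block interval for a streaming pipeline.

    An explicit user setting always wins. Otherwise use 20% of the
    batch interval (floor of 1 ms is an assumption of this sketch).
    """
    if user_block_interval_ms is not None:
        return user_block_interval_ms
    return max(1, batch_interval_ms * 20 // 100)


print(effective_block_interval_ms(10_000))       # prints: 2000
print(effective_block_interval_ms(10_000, 200))  # prints: 200
```

With a 10 second batch interval, a 2000 ms block interval gives 5 blocks per batch per receiver, compared with 50 at the 200 ms default.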
IT › DC › #201 1 month ago
Merge pull request #1033 from cdapio/disable-yarn-mem-check
disable yarn pmem check
disable yarn pmem check
CDAP services are getting killed by yarn due to physical memory
constraints. Disable the pmem check to avoid this.
Testless build
IT › UPD2 › #254 1 month ago
Merge pull request #1033 from cdapio/disable-yarn-mem-check
disable yarn pmem check
disable yarn pmem check
CDAP services are getting killed by yarn due to physical memory
constraints. Disable the pmem check to avoid this.
Testless build
IT › IIT › #497 1 month ago
Merge pull request #1033 from cdapio/disable-yarn-mem-check
disable yarn pmem check
disable yarn pmem check
CDAP services are getting killed by yarn due to physical memory
constraints. Disable the pmem check to avoid this.
Testless build
CDAP › RUT › #939 1 month ago
Merge pull request #11614 from cdapio/bugfix-ui/CDAP-update-homepage-card-styling
reduce homepage card height
Merge pull request #11620 from cdapio/feature/CDAP-15282-fix-workflow-state
(CDAP-15282) Fix workflow state
2 of 1887 failed
CDAP › DUT › #2740 1 month ago
Merge pull request #11614 from cdapio/bugfix-ui/CDAP-update-homepage-card-styling
reduce homepage card height
Merge pull request #11620 from cdapio/feature/CDAP-15282-fix-workflow-state
(CDAP-15282) Fix workflow state
1 of 1897 failed
IT › UPD2 › #249 1 month ago
Merge pull request #11613 from cdapio/feature/CDAP-15265-allow-defaulting-dataproc-image
CDAP-15265 allow defaulting dataproc image through cdap site
Merge pull request #11615 from cdapio/feature/CDAP-15899-fix-provisioner-polling
CDAP-15899 fix polling interval for provisioning tasks
CDAP-15899 fix polling interval for provisioning tasks
CDAP-15265 allow defaulting dataproc image through cdap site
Testless build
CDAP › DUT › #2735 1 month ago
Merge pull request #11615 from cdapio/feature/CDAP-15899-fix-provisioner-polling
CDAP-15899 fix polling interval for provisioning tasks
CDAP-15899 fix polling interval for provisioning tasks
2624 passed
Build Completed Code commits Tests
CDAP › RUT › #946 4 days ago
CDAP-16027 fix NPE when stage artifact is not configured
Merge pull request #11690 from cdapio/bugfix_release/CDAP-16027-fix-NPE-when-artifact-missing
CDAP-16027 fix NPE when stage artifact is not configured
Merge pull request #11683 from cdapio/bugfix_release/CDAP-15973-log-dataproc-failures
CDAP-15973 log cluster creation failures in dataproc provisioner
CDAP-15973 log cluster creation failures in dataproc provisioner
Enhanced the provisioner to log the creation failure error message
if it sees a failed cluster. Without this, the user will not know
why the cluster failed to create.
CDAP-15973 log a warning if multiple create operations found
2624 passed
IT › DC › #209 1 week ago
remove validator test
the validator plugin was removed, so removing the test as well
Merge pull request #1034 from cdapio/remove-validator-test
remove validator test
42 passed
HYP › BAD › #262 2 weeks ago
CDAP-15917 remove very outdated validator plugin
throw exception in preparerun
Merge pull request #945 from cdapio/bugfix_release/CDAP-15428-md5-validation
CDAP-15428 fixed MD5 hasher plugin to check for field type
Merge pull request #944 from cdapio/feature_release/CDAP-15629-fix-get-schema
CDAP-15928 CDAP-15629 CDAP-15927 fix db source schema
Merge pull request #972 from cdapio/feature_release/CDAP-make-prepareRun-non-final
Make prepareRun() non final in AbstractFileSink
Merge pull request #977 from cdapio/bump-version-2-3-0
bump version to 2.3.0 and update submodule
CDAP-15428 fixed MD5 hasher plugin to check for field type
Fixed the md5 hasher plugin to validate that the fields specified
are strings and to fail if they are not.
bump version to 2.3.0 and update submodule
Merge pull request #956 from cdapio/feature/CDAP-15787-4-Aggregators
(CDAP-15787) 4 Aggregators
Merge pull request #961 from cdapio/feature/CDAP-15917-remove-validator
CDAP-15917 remove very outdated validator plugin
Merge pull request #978 from cdapio/merge-release
Merge release
Merge remote-tracking branch 'origin/release/2.3' into merge-release
CDAP-15928 CDAP-15629 CDAP-15927 fix db source schema
Removed the 'query' property that was only used in the previous
get schema implementation, which is no longer used. Instead,
use the import query to get the schema so that the schema
is properly fetched during stage validation.

This also fixes a bug where a pipeline could be deployed with an
empty import query, which would fail at runtime.

Also fixed a bug that would cause a null pointer exception when
no password is given.

Also fixed a bug in the query cleanup logic that strips the
$CONDITIONS clause from the import query when fetching the schema.
The previous logic was upper casing the entire query, which would
cause errors in case-sensitive DBs like HyperSQL, which is used in
unit tests. It also was not handling 'ors' in the where clauses.

Also removed a bunch of warnings in unit tests and changed a unit
test that was testing for runtime failures to instead test for
deployment failure since a bad query now will fail at deployment
instead of runtime.
Merge pull request #948 from cdapio/feature/CDAP-15787-Alert-Plugin
(CDAP-15787) Alert plugin
823 passed
IT › UPD2 › #265 3 weeks ago
Merge pull request #11656 from cdapio/bugfix_release/CDAP-15943-fix-remote-runs-with-ssl
CDAP-15943 fix remote runs when master is using internal ssl
Merge pull request #11641 from cdapio/feature/CDAP-15798-remove-misleading-warning
CDAP-15798 suppress warning for a normal scenario
Merge pull request #11632 from cdapio/bugfix_release/CDAP-15929-fix-missing-required-plugin-fields
CDAP-15929 add missing required fields to failure collector
CDAP-15905 evaluate macros in pipeline stage validation endpoint
Added secure macro evaluation to the pipeline stage validation
endpoint. This required changing the pipeline studio app into
a system service app, which also required changing the bootstrap
config to overwrite the existing studio app if it already exists
on startup, in order for it to automatically get updated when
CDAP is upgraded.

The validation endpoint doesn't want to throw an error if macros
couldn't be evaluated. Instead, it wants to use the original
string if any error is encountered. In order to support this,
refactored the MacroParser to take a MacroParserOptions object
that contains all of the current parser options as well as a new
option to skip evaluation of invalid macros. Enhanced the
system http context to allow the caller to pass in the parser
options when performing evaluation.
CDAP-15798 suppress warning for a normal scenario
It is normal for the RuntimeMonitor connection to take some time
to be established, so skip the first several warnings and only
warn once every 30 seconds after that. This avoids a string of
misleading warnings at the start of a remote run.
CDAP-15394 remove parallel sink writes by default
Added a runtime argument to enable parallel sink writes in Spark
pipelines, which defaults to false. This is because the parallel
writes cause Spark to re-process much of the pipeline, resulting
in very confusing metrics for most people.

In some situations, this will cause the pipeline to run slower
than it otherwise would have, but the default experience for
people should be better.
Merge pull request #11657 from cdapio/bugfix_release/CDAP-15944-skip-warning-on-normal-situation
CDAP-15944 skip the first 10 warnings in RuntimeMonitor
CDAP-15944 skip the first 10 warnings in RuntimeMonitor
It is normal to get errors at the start of the RuntimeMonitor
as it is waiting for the remote address to become available.
Skip the first 10 messages so that the misleading warnings are
not logged.
Merge pull request #11637 from cdapio/bugfix_release/CDAP-15905-evaluate-macros-in-validate
CDAP-15905 evaluate macros in pipeline stage validation endpoint
CDAP-15943 fix remote runs when master is using internal ssl
For remote runs, unset the ssl cert path so that it doesn't
try to use a cert/private key file that doesn't exist.

Also update the RemoteExecutionDiscoveryService to serialize
an entire Discoverable instead of just the address in order to
preserve the payload, and to correctly handle http/https based
on configuration.
Merge pull request #11643 from cdapio/bugfix_release/CDAP-15394-remove-parallel-sinks-by-default
CDAP-15394 remove parallel sink writes by default
Merge remote-tracking branch 'origin/release/6.1' into merge-release/6.1.0
Merge pull request #11670 from cdapio/merge-release/6.1.0
Merge release/6.1.0
CDAP-15929 add missing required fields to failure collector
Ensure that the validation endpoint returns explicit failures
for a missing required field instead of some generic error.

This required enhancing the CDAP API to throw an exception that
returns which required properties are missing and which properties
could not be assigned due to an invalid value.
Testless build
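
The warning suppression described in CDAP-15798 and CDAP-15944 above (skip the first warnings, then warn at most once every 30 seconds) can be sketched as a small throttle. This is not the RuntimeMonitor code: the class name, its parameters, and the choice to combine both commits in one object are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class WarningThrottle:
    """Decide whether a repeated warning should be logged.

    Hypothetical helper: skips the first skip_first warnings, then allows
    at most one warning per min_interval_s seconds.
    """

    def __init__(
        self,
        skip_first: int = 10,
        min_interval_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._skip_first = skip_first
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._seen = 0
        self._last_logged: Optional[float] = None

    def should_log(self) -> bool:
        self._seen += 1
        if self._seen <= self._skip_first:
            return False  # still within the expected start-up failures
        now = self._clock()
        if (
            self._last_logged is not None
            and now - self._last_logged < self._min_interval_s
        ):
            return False  # a warning was logged too recently
        self._last_logged = now
        return True


# Simulate one failed connection attempt every 5 seconds with a fake clock.
fake_now = [0.0]
throttle = WarningThrottle(clock=lambda: fake_now[0])
logged = []
for attempt in range(1, 20):
    fake_now[0] = attempt * 5.0
    if throttle.should_log():
        logged.append(attempt)
print(logged)  # prints: [11, 17]
```

Passing the clock in as a parameter keeps the sketch testable without sleeping.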
IT › DC › #200 1 month ago
CDAP-15879 test GCS formats
Modified the existing GCS test to test that all the formats can
be used, and to also test the usage of regex.
Merge pull request #1031 from cdapio/feature/CDAP-15879-gcs-formats-test
CDAP-15879 test GCS formats
43 passed
IT › UPD2 › #250 1 month ago
CDAP-15879 test GCS formats
Modified the existing GCS test to test that all the formats can
be used, and to also test the usage of regex.
Merge pull request #1031 from cdapio/feature/CDAP-15879-gcs-formats-test
CDAP-15879 test GCS formats
Testless build
IT › IIT › #493 1 month ago
CDAP-15879 test GCS formats
Modified the existing GCS test to test that all the formats can
be used, and to also test the usage of regex.
Merge pull request #1031 from cdapio/feature/CDAP-15879-gcs-formats-test
CDAP-15879 test GCS formats
47 passed
HYP › BAD › #258 1 month ago
Merge pull request #936 from cdapio/feature/CDAP-15794-fix-bad-delimited-error-message
CDAP-15794 use a better error message when delimited schema is wrong
CDAP-15794 use a better error message when delimited schema is wrong
Added an explicit error message in the delimited format when the
data has more fields than the schema. This replaces the previous
behavior, which would throw a NoSuchElementException.

Added a special check to see if the user meant to use the 'text'
format, as that seems to be a very common user error when the
file source is paired with wrangler.
812 passed
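
The explicit error from CDAP-15794 above (fail with a clear message when a record has more fields than the schema) can be sketched as below. This is not the plugin's code: the function name, the message wording, and the use of ValueError are assumptions, and the special hint about the 'text' format is not modelled because the commit message does not say how it is detected.

```python
from typing import Dict, List


def parse_delimited(
    line: str, field_names: List[str], delimiter: str = ","
) -> Dict[str, str]:
    """Split one delimited record and map its values onto schema fields.

    Raises ValueError with an explicit message when the record has more
    values than the schema has fields (hypothetical message wording).
    """
    values = line.split(delimiter)
    if len(values) > len(field_names):
        raise ValueError(
            f"Found {len(values)} fields in the data, but the schema only "
            f"contains {len(field_names)} fields."
        )
    # Assumption: records with fewer values simply leave trailing fields out.
    return dict(zip(field_names, values))


print(parse_delimited("1,alice", ["id", "name"]))
# prints: {'id': '1', 'name': 'alice'}
```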
HYP › BAD › #255 1 month ago
Merge pull request #933 from cdapio/feature/CDAP-15879-fix-regex-path-filter
CDAP-15879 fix RegexPathFilter class loading
CDAP-15879 fix RegexPathFilter class loading
Fixed a bug where RegexPathFilter needed to be exported by any
plugins using the AbstractFileSource. It is a shared class that
should not be exported by multiple plugins, otherwise classloading
conflicts could arise.

The class gets instantiated by FileInputFormat when generating
splits. Fixed the format input formats to set the Configuration
classloader to the plugin classloader, which will be able to
create RegexPathFilter without exports since the class is packaged
in the plugin jar.

This also fixes a related bug, where the FileSystem used by
RegexPathFilter would only contain the filesystems deployed
directly on the cluster, and not the connectors packaged in the
plugins. This was causing the GCS plugins to fail
when run on clusters without the GCS connector installed.
814 passed
IT › UPD2 › #236 1 month ago
Merge pull request #1022 from AntonGensitskiy/bug/CDAP-15771
CDAP-15771: Added check for BigQuery bytes type.
Testless build