albertshau <ashau@google.com>: Author Summary

Builds triggered by albertshau <ashau@google.com>

Builds triggered by an author are those builds that contain changes committed by the author.
Total builds triggered: 955
288 (30%)
667 (70%)

Breakages and fixes

Broken means the build failed but the previous build was successful.
Fixed means the build was successful but the previous build had failed.
Broken: 86 (9% of all builds triggered)
Fixed: 75 (8% of all builds triggered)
Fixed minus broken: -11
Build Completed Code commits Tests
IT › UPD2 › #321 3 weeks ago
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
Testless build
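The CDAP-16110 fix above limits what the RecordReaders read, rather than only what is passed to later stages. The following is a minimal sketch of that idea in Python, not the Java input format wrapper the commit describes. `CountingSource` and `limit_reads` are hypothetical names introduced only for illustration.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


class CountingSource:
    """Hypothetical source that counts how many records were actually read."""

    def __init__(self, records):
        self._records = list(records)
        self.records_read = 0

    def __iter__(self):
        for record in self._records:
            self.records_read += 1
            yield record


def limit_reads(reader: Iterable[T], max_records: int) -> Iterator[T]:
    """Stop pulling from the underlying reader once max_records were returned."""
    if max_records <= 0:
        return
    for count, record in enumerate(reader, start=1):
        yield record
        if count >= max_records:
            # Returning here means the wrapped reader is never asked for more.
            return


source = CountingSource(range(1000))
preview = list(limit_reads(source, 3))
print(preview, source.records_read)
```

Because the limit is applied where the records are pulled, the source in this sketch is only asked for as many records as the preview needs, which is the behavior the commit message describes for previews.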
CDAP › DUT › #2756 3 weeks ago
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
2626 passed
CDAP › RUT › #954 3 weeks ago
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
2626 passed
CDAP › URUT › #755 3 weeks ago
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
Testless build
CDAP › UDUT › #746 3 weeks ago
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
Testless build
CDAP › DRC › #4695 3 weeks ago
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
Testless build
IT › UPD2 › #314 1 month ago
CDAP-16069 fix FLL performance
Fixed the logic around calculating incoming field lineage to
avoid re-computation. This avoids an amount of re-computation that
is exponential in the number of field connections. Instead, it
is exponential in the number of datasets, which is many
orders of magnitude smaller than field connections.
Testless build
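The CDAP-16069 message above is about avoiding re-computation when calculating incoming field lineage. One common way to avoid such re-computation is to cache each intermediate result. The sketch below shows that technique on a small acyclic field graph. It is an assumption for illustration, not the algorithm used in CDAP, and the `upstream` mapping and `source_fields` function are hypothetical.

```python
from typing import Dict, FrozenSet, List


def source_fields(field: str,
                  upstream: Dict[str, List[str]],
                  cache: Dict[str, FrozenSet[str]]) -> FrozenSet[str]:
    """Return the origin fields feeding `field`, computing each node only once.

    `upstream` maps a field to the fields it is derived from and is assumed
    to be acyclic. Fields with no upstream entry are treated as origins.
    """
    if field in cache:
        return cache[field]
    parents = upstream.get(field, [])
    if not parents:
        result = frozenset([field])
    else:
        result = frozenset().union(
            *(source_fields(parent, upstream, cache) for parent in parents)
        )
    cache[field] = result
    return result


# A diamond: "total" depends on "price" and "tax", both derived from "raw_amount".
upstream = {
    "total": ["price", "tax"],
    "price": ["raw_amount"],
    "tax": ["raw_amount"],
}
print(sorted(source_fields("total", upstream, {})))  # ['raw_amount']
```

Without the cache, a chain of such diamonds is walked once per path, and the number of paths doubles with every diamond. With the cache, each field is resolved once.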
CDAP › RUT › #951 1 month ago
CDAP-16069 fix FLL performance
Fixed the logic around calculating incoming field lineage to
avoid re-computation. This avoids an amount of re-computation that
is exponential in the number of field connections. Instead, it
is exponential in the number of datasets, which is many
orders of magnitude smaller than field connections.
2624 passed
CDAP › UDUT › #742 1 month ago
CDAP-16069 fix FLL performance
Fixed the logic around calculating incoming field lineage to
avoid re-computation. This avoids an amount of re-computation that
is exponential in the number of field connections. Instead, it
is exponential in the number of datasets, which is many
orders of magnitude smaller than field connections.
Testless build
CDAP › DUT › #2752 1 month ago
CDAP-16069 fix FLL performance
Fixed the logic around calculating incoming field lineage to
avoid re-computation. This avoids an amount of re-computation that
is exponential in the number of field connections. Instead, it
is exponential in the number of datasets, which is many
orders of magnitude smaller than field connections.
1 of 1871 failed
Build Completed Code commits Tests
IT › UPD2 › #321 3 weeks ago
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
Testless build
CDAP › DUT › #2752 1 month ago
CDAP-16069 fix FLL performance
Fixed the logic around calculating incoming field lineage to
avoid re-computation. This avoids an amount of re-computation that
is exponential in the number of field connections. Instead, it
is exponential in the number of datasets, which is many
orders of magnitude smaller than field connections.
1 of 1871 failed
IT › SAN › #186 1 month ago
Merge pull request #1042 from yeweidaniel/integration-docs
Update steps for running integration test
1 of 42 failed
CDAP › DUT › #2748 1 month ago
log warnings in one message to ensure they are always together
CDAP-15599 log dataproc creation warnings
Log warnings that are encountered when creating dataproc clusters.
Also change the default disk sizes so that they don't generate
a warning about disk performance.

Also add some logic on delete requests to check if the cluster
state is already in the deleting state. This is done to avoid
logging a red herring in a specific edge case.
Merge pull request #11544 from cdapio/bugfix_release/CDAP-15599-log-dataproc-warnings
Bugfix release/cdap 15599 log dataproc warnings
1 of 1881 failed
IT › UPD2 › #286 1 month ago
remove validator test
the validator plugin was removed, so removing the test as well
Merge pull request #1034 from cdapio/remove-validator-test
remove validator test
Testless build
CDAP › RUT › #945 2 months ago
Merge pull request #11674 from cdapio/bugfix_release/CDAP-15971-reorder-dataproc-properties
CDAP-15971 reorder CMEK property to be on bottom
Merge pull request #11672 from cdapio/bugfix_release/CDAP-15501-set-blockinterval
CDAP-15501 set block interval automatically
CDAP-15501 set block interval automatically
For most streaming pipelines, the default block interval of
200ms generates too many partitions. This is especially true of
file based sinks, which will write out a file for each part.

If the block interval is not explicitly set by the user, set it
to 20% of the batch interval to generate a smaller number of
partitions.
CDAP-15971 reorder CMEK property to be on bottom
Move the CMEK dataproc property to the bottom of the section and
change the bucket property to be of medium size. This fixes the
weird styling where everything is medium except for CMEK, which is
in the middle of the section.
1 of 1873 failed
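The CDAP-15501 rule above (use 20% of the batch interval unless the user set a block interval) can be sketched as follows. The function name, the millisecond units, and the 1 ms lower bound are assumptions made for this illustration and are not taken from the commit.

```python
from typing import Optional


def block_interval_ms(batch_interval_ms: int,
                      user_block_interval_ms: Optional[int] = None) -> int:
    """Pick a block interval for a streaming pipeline.

    An explicit user setting always wins. Otherwise use 20% of the batch
    interval, with a floor of 1 ms (the floor is an assumption of this sketch).
    """
    if user_block_interval_ms is not None:
        return user_block_interval_ms
    return max(1, batch_interval_ms // 5)


print(block_interval_ms(10_000))       # 2000
print(block_interval_ms(10_000, 200))  # 200
```

For a 10 second batch, a 200 ms block interval gives 50 blocks per batch, while the 20% rule gives 5, which is where the smaller partition count comes from.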
IT › DC › #201 2 months ago
Merge pull request #1033 from cdapio/disable-yarn-mem-check
disable yarn pmem check
disable yarn pmem check
CDAP services are getting killed by yarn due to physical memory
constraints. Disable the pmem check to avoid this.
Testless build
IT › UPD2 › #254 2 months ago
Merge pull request #1033 from cdapio/disable-yarn-mem-check
disable yarn pmem check
disable yarn pmem check
CDAP services are getting killed by yarn due to physical memory
constraints. Disable the pmem check to avoid this.
Testless build
IT › IIT › #497 2 months ago
Merge pull request #1033 from cdapio/disable-yarn-mem-check
disable yarn pmem check
disable yarn pmem check
CDAP services are getting killed by yarn due to physical memory
constraints. Disable the pmem check to avoid this.
Testless build
CDAP › DUT › #2740 2 months ago
Merge pull request #11614 from cdapio/bugfix-ui/CDAP-update-homepage-card-styling
reduce homepage card height
Merge pull request #11620 from cdapio/feature/CDAP-15282-fix-workflow-state
(CDAP-15282) Fix workflow state
1 of 1897 failed
Build Completed Code commits Tests
CDAP › RUT › #954 3 weeks ago
Merge pull request #11730 from cdapio/bugfix_release/CDAP-16110-pipeline-preview-limit
CDAP-16110 limit reads for batch pipeline previews
CDAP-16110 limit reads for batch pipeline previews
Fixed a bug in batch pipeline previews where the full input data
was still being read, though not all of it was being output.

Wrapped the input format provided by sources to properly limit
data read by the RecordReaders instead of only limiting the records
output to subsequent stages.
2626 passed
IT › SAN › #182 1 month ago
Merge pull request #1038 from AntonGensitskiy/PLUGIN-82
PLUGIN-82: Added new tests to cover dedupe functionality.
42 passed
IT › IIT › #531 1 month ago
Merge pull request #1038 from AntonGensitskiy/PLUGIN-82
PLUGIN-82: Added new tests to cover dedupe functionality.
46 passed
CDAP › RUT › #946 1 month ago
CDAP-16027 fix NPE when stage artifact is not configured
Merge pull request #11690 from cdapio/bugfix_release/CDAP-16027-fix-NPE-when-artifact-missing
CDAP-16027 fix NPE when stage artifact is not configured
Merge pull request #11683 from cdapio/bugfix_release/CDAP-15973-log-dataproc-failures
CDAP-15973 log cluster creation failures in dataproc provisioner
CDAP-15973 log cluster creation failures in dataproc provisioner
Enhanced the provisioner to log the creation failure error message
if it sees a failed cluster. Without this, the user will not know
why the cluster failed to create.
CDAP-15973 log a warning if multiple create operations found
2624 passed
IT › DC › #209 1 month ago
remove validator test
the validator plugin was removed, so removing the test as well
Merge pull request #1034 from cdapio/remove-validator-test
remove validator test
42 passed
HYP › BAD › #262 2 months ago
CDAP-15917 remove very outdated validator plugin
throw exception in preparerun
Merge pull request #945 from cdapio/bugfix_release/CDAP-15428-md5-validation
CDAP-15428 fixed MD5 hasher plugin to check for field type
Merge pull request #944 from cdapio/feature_release/CDAP-15629-fix-get-schema
CDAP-15928 CDAP-15629 CDAP-15927 fix db source schema
Merge pull request #972 from cdapio/feature_release/CDAP-make-prepareRun-non-final
Make prepareRun() non final in AbstractFileSink
Merge pull request #977 from cdapio/bump-version-2-3-0
bump version to 2.3.0 and update submodule
CDAP-15428 fixed MD5 hasher plugin to check for field type
Fixed the md5 hasher plugin to validate that the fields specified
are strings and to fail if they are not.
bump version to 2.3.0 and update submodule
Merge pull request #956 from cdapio/feature/CDAP-15787-4-Aggregators
(CDAP-15787) 4 Aggregators
Merge pull request #961 from cdapio/feature/CDAP-15917-remove-validator
CDAP-15917 remove very outdated validator plugin
Merge pull request #978 from cdapio/merge-release
Merge release
Merge remote-tracking branch 'origin/release/2.3' into merge-release
CDAP-15928 CDAP-15629 CDAP-15927 fix db source schema
Removed the 'query' property that was only used in the previous
get schema implementation, which is no longer used. Instead,
use the import query to get the schema so that the schema
is properly fetched during stage validation.

This also fixes a bug where a pipeline could be deployed with an
empty import query, which would fail at runtime.

Also fixed a bug that would cause a null pointer exception when
no password is given.

Also fixed a bug in the query cleanup logic that strips the
$CONDITIONS clause from the import query when fetching the schema.
The previous logic was upper casing the entire query, which would
cause errors in case-sensitive DBs like HyperSQL, which is used in
unit tests. It also was not handling 'ors' in the where clauses.

Also removed a bunch of warnings in unit tests and changed a unit
test that was testing for runtime failures to instead test for
deployment failure since a bad query now will fail at deployment
instead of runtime.
Merge pull request #948 from cdapio/feature/CDAP-15787-Alert-Plugin
(CDAP-15787) Alert plugin
823 passed
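The db source commit above mentions stripping the $CONDITIONS clause without upper casing the whole query, and handling 'or' as well as 'and'. The sketch below shows one way that could look: keywords are matched case-insensitively while the rest of the query keeps its original case. `strip_conditions` is a hypothetical helper, not the plugin's implementation, and it only covers the simple, unparenthesized shapes shown.

```python
import re

_TOKEN = r"\$CONDITIONS"


def strip_conditions(query: str) -> str:
    """Remove the $CONDITIONS placeholder without changing the query's case."""
    # "... AND $CONDITIONS" or "... OR $CONDITIONS"
    query = re.sub(r"\s+(?:and|or)\s+" + _TOKEN, "", query, flags=re.IGNORECASE)
    # "$CONDITIONS AND ..." or "$CONDITIONS OR ..."
    query = re.sub(_TOKEN + r"\s+(?:and|or)\s+", "", query, flags=re.IGNORECASE)
    # A bare "WHERE $CONDITIONS" with nothing else in the clause
    query = re.sub(r"\s+where\s+" + _TOKEN, "", query, flags=re.IGNORECASE)
    return query


print(strip_conditions('select "Id" from "Users" where "Age" > 1 or $CONDITIONS'))
```

Quoted identifiers such as "Users" keep their case, which matters for a case-sensitive database like the HyperSQL instance mentioned in the commit.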
IT › UPD2 › #265 2 months ago
Merge pull request #11656 from cdapio/bugfix_release/CDAP-15943-fix-remote-runs-with-ssl
CDAP-15943 fix remote runs when master is using internal ssl
Merge pull request #11641 from cdapio/feature/CDAP-15798-remove-misleading-warning
CDAP-15798 suppress warning for a normal scenario
Merge pull request #11632 from cdapio/bugfix_release/CDAP-15929-fix-missing-required-plugin-fields
CDAP-15929 add missing required fields to failure collector
CDAP-15905 evaluate macros in pipeline stage validation endpoint
Added secure macro evaluation to the pipeline stage validation
endpoint. This required changing the pipeline studio app into
a system service app, which also required changing the bootstrap
config to overwrite the existing studio app if it already exists
on startup, in order for it to automatically get updated when
CDAP is upgraded.

The validation endpoint doesn't want to throw an error if macros
couldn't be evaluated. Instead, it wants to use the original
string if any error is encountered. In order to support this,
refactored the MacroParser to take a MacroParserOptions object
that contains all of the current parser options as well as a new
option to skip evaluation of invalid macros. Enhanced the
system http context to allow the caller to pass in the parser
options when performing evaluation.
CDAP-15798 suppress warning for a normal scenario
It is normal for the RuntimeMonitor connection to take some time
to be established, so skip the first several warnings and only
warn once every 30 seconds after that. This avoids a string of
misleading warnings at the start of a remote run.
CDAP-15394 remove parallel sink writes by default
Added a runtime argument to enable parallel sink writes in Spark
pipelines, which defaults to false. This is because parallel
writes cause Spark to re-process much of the pipeline, resulting
in very confusing metrics for most people.

In some situations, this will cause the pipeline to run slower
than it otherwise would have, but the default experience for
people should be better.
Merge pull request #11657 from cdapio/bugfix_release/CDAP-15944-skip-warning-on-normal-situation
CDAP-15944 skip the first 10 warnings in RuntimeMonitor
CDAP-15944 skip the first 10 warnings in RuntimeMonitor
It is normal to get errors at the start of the RuntimeMonitor
as it is waiting for the remote address to become available.
Skip the first 10 messages so that the misleading warnings are
not logged.
Merge pull request #11637 from cdapio/bugfix_release/CDAP-15905-evaluate-macros-in-validate
CDAP-15905 evaluate macros in pipeline stage validation endpoint
CDAP-15943 fix remote runs when master is using internal ssl
For remote runs, unset the ssl cert path so that it doesn't
try to use a cert/private key file that doesn't exist.

Also update the RemoteExecutionDiscoveryService to serialize
an entire Discoverable instead of just the address in order to
preserve the payload, and to correctly handle http/https based
on configuration.
Merge pull request #11643 from cdapio/bugfix_release/CDAP-15394-remove-parallel-sinks-by-default
CDAP-15394 remove parallel sink writes by default
Merge remote-tracking branch 'origin/release/6.1' into merge-release/6.1.0
Merge pull request #11670 from cdapio/merge-release/6.1.0
Merge release/6.1.0
CDAP-15929 add missing required fields to failure collector
Ensure that the validation endpoint returns explicit failures
for a missing required field instead of some generic error.

This required enhancing the CDAP API to throw an exception that
returns which required properties are missing and which properties
could not be assigned due to an invalid value.
Testless build
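The CDAP-15798 and CDAP-15944 messages above describe skipping the first warnings and then warning at most once every 30 seconds. A minimal sketch of that throttling follows. Combining both rules in one class is an assumption of this example, and `ThrottledWarning` and its parameters are hypothetical names rather than anything in RuntimeMonitor.

```python
import time
from typing import Callable, Optional


class ThrottledWarning:
    """Decide whether a repeated warning should be logged."""

    def __init__(self, skip_first: int = 10, min_interval_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._skip_first = skip_first
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._seen = 0
        self._last_logged: Optional[float] = None

    def should_log(self) -> bool:
        self._seen += 1
        # The first few failures are expected while the remote side starts up.
        if self._seen <= self._skip_first:
            return False
        now = self._clock()
        if self._last_logged is None or now - self._last_logged >= self._min_interval_s:
            self._last_logged = now
            return True
        return False


throttle = ThrottledWarning(skip_first=2)
print([throttle.should_log() for _ in range(4)])
```

A caller would wrap its log statement in `if throttle.should_log():`. Passing the clock in makes the timing rule easy to exercise without sleeping.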
IT › DC › #200 2 months ago
CDAP-15879 test GCS formats
Modified the existing GCS test to test that all the formats can
be used, and to also test the usage of regex.
Merge pull request #1031 from cdapio/feature/CDAP-15879-gcs-formats-test
CDAP-15879 test GCS formats
43 passed
IT › UPD2 › #250 2 months ago
CDAP-15879 test GCS formats
Modified the existing GCS test to test that all the formats can
be used, and to also test the usage of regex.
Merge pull request #1031 from cdapio/feature/CDAP-15879-gcs-formats-test
CDAP-15879 test GCS formats
Testless build
IT › IIT › #493 2 months ago
CDAP-15879 test GCS formats
Modified the existing GCS test to test that all the formats can
be used, and to also test the usage of regex.
Merge pull request #1031 from cdapio/feature/CDAP-15879-gcs-formats-test
CDAP-15879 test GCS formats
47 passed