Build #3,022

Cask Data Application Platform - Develop Build and Test

Build: #3022 failed Child of CDAP-DRC-5031

Build result summary


36 minutes
b1f303d704750c8a8fad879020e8a874ea4d8eff
Total tests
Fixed in #3023 (Child of CDAP-DRC-5032)


No one has taken responsibility for this failure

Code commits

Author Commit Message Commit date
albertshau b1f303d704750c8a8fad879020e8a874ea4d8eff Merge pull request #12558 from cdapio/feature/CDAP-17078-spark-stage-consolidation
CDAP-17078 consolidate stages within a group
albertshau 7e4f1e1fe9cf01b9fa3efeb05b3c76865a0633bb CDAP-17078 consolidate stages within a group
Changed the SparkPipelineRunner to use a CombinerDag to group
sinks and their preceding transforms together. These grouped
stages are treated similarly to how a single sink is treated,
with flatMapToPair() called on the input RDD to transform it
into a PairRDD, then calling save() to write the RDD out.
This capability is off by default, but can be turned on by
setting a runtime argument.

Instead of flatMapToPair() calling just the sink's transform
method, a new MultiSinkFunction class is used to direct incoming
records to the correct logical branches of the pipeline.
This requires that each input be tagged with which stage it
came from (stage and port), as well as its type (output, or error).
In order to do this, refactored the SparkPipelineRunner a bit
to maintain the RDD<RecordInfo> for each stage rather than
RDD<StructuredRecord>, as the RecordInfo class contains that
extra information.
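
The routing idea described above can be sketched in plain Java without Spark. This is a minimal illustration, not the CDAP implementation: `TaggedRecord`, `Router`, `connect`, and `route` are hypothetical names invented for the example, and records are simplified to strings. It only shows how tagging each record with its origin stage, port, and type lets one function fan records out to the right branches as (branch, value) pairs.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: all names here are hypothetical, not the CDAP classes. */
public class MultiSinkRoutingSketch {

  /** Whether a record is regular output or an error from its origin stage. */
  public enum RecordType { OUTPUT, ERROR }

  /** Stand-in for the idea behind RecordInfo: a value plus where it came from. */
  public static final class TaggedRecord {
    final String fromStage;
    final String fromPort; // null when the stage has no named output port
    final RecordType type;
    final String value;

    public TaggedRecord(String fromStage, String fromPort, RecordType type, String value) {
      this.fromStage = fromStage;
      this.fromPort = fromPort;
      this.type = type;
      this.value = value;
    }
  }

  /** Sends each tagged record to every branch registered for its origin. */
  public static final class Router {
    private final Map<String, List<String>> branches = new HashMap<>();

    /** Registers a branch that consumes records of the given origin and type. */
    public void connect(String fromStage, String fromPort, RecordType type, String branch) {
      branches.computeIfAbsent(key(fromStage, fromPort, type), k -> new ArrayList<>())
          .add(branch);
    }

    /** Returns (branch, value) pairs, loosely analogous to a flatMapToPair() result. */
    public List<Map.Entry<String, String>> route(TaggedRecord record) {
      List<Map.Entry<String, String>> out = new ArrayList<>();
      String k = key(record.fromStage, record.fromPort, record.type);
      for (String branch : branches.getOrDefault(k, Collections.<String>emptyList())) {
        out.add(new SimpleImmutableEntry<>(branch, record.value));
      }
      return out;
    }

    private static String key(String stage, String port, RecordType type) {
      return stage + "|" + (port == null ? "" : port) + "|" + type;
    }
  }
}
```

A record whose origin has no registered branch yields an empty list, so it is simply dropped by the sketch.
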

Also added a MultiOutputFormat that will take the output of the
MultiSinkFunction and delegate writes to the correct underlying
OutputFormat. Since the OutputFormat lives in the pipeline
app, this approach means CDAP datasets cannot be combined.
This caused a problem with dataset lineage, since it is
implemented by wrapping OutputFormats into a hidden
ExternalDataset class in CDAP. Instead of doing this indirect
wrapping, changed the SparkSinkFactory class to explicitly
register lineage through direct calls instead of hiding it
under several layers of abstraction.
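
The delegation described for MultiOutputFormat can be pictured with a small plain-Java sketch. It is an assumption-laden illustration rather than the actual class: `DelegatingWriterSketch`, `register`, and `write` are hypothetical names, and each underlying OutputFormat is reduced to a `Consumer<String>`. The point is only that a single writer looks up the named sink and forwards the record to it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Illustrative only: a hypothetical stand-in for a delegating output writer. */
public class DelegatingWriterSketch {
  // One underlying writer per sink name; each stands in for a real OutputFormat.
  private final Map<String, Consumer<String>> delegates = new HashMap<>();

  /** Registers the writer that handles records for the given sink. */
  public void register(String sinkName, Consumer<String> writer) {
    delegates.put(sinkName, writer);
  }

  /** Forwards the record to the writer registered for its sink. */
  public void write(String sinkName, String record) {
    Consumer<String> delegate = delegates.get(sinkName);
    if (delegate == null) {
      throw new IllegalArgumentException("No writer registered for sink " + sinkName);
    }
    delegate.accept(record);
  }
}
```

Failing fast on an unknown sink name is a choice made for the sketch; the commit message does not say how the real class handles that case.
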
sagarkapare 3b90e4091934ebdcb8353dd87c10a47d7855a51d Merge pull request #12524 from cdapio/feature/CDAP-16712-remote-fetcher-impl
[CDAP-16712] Separate out preview manager and preview runners so that they can be run independently in their own containers.
Jenna Choi 5686de64c4e234f11ec9c6bffab375fc5257502a Merge pull request #12573 from cdapio/feature-ui/CDAP-17167
[CDAP-17167] Resolve CDAP UI unit test failure
Jenna Choi a89a829ffaef4ac72db9846a23c66dfdf38e0f46 [CDAP-17167] Resolve CDAP UI unittest failure


New test failures 1
Status Test Job Duration
Failed SqlAppMetadataStoreTest Unit Test Job < 1 sec
Gave up waiting for server to start after 10000ms
	at com.opentable.db.postgres.embedded.EmbeddedPostgres.waitForServerStartup(
	at com.opentable.db.postgres.embedded.EmbeddedPostgres.startPostmaster(
	at com.opentable.db.postgres.embedded.EmbeddedPostgres.<init>(
	at com.opentable.db.postgres.embedded.EmbeddedPostgres$Builder.start(
(42 more lines...)

JIRA issues

Unknown Issue Type CDAP-16712 Could not obtain issue details from JIRA
Unknown Issue Type CDAP-17078 Could not obtain issue details from JIRA
Unknown Issue Type CDAP-17157 Could not obtain issue details from JIRA
Unknown Issue Type CDAP-17167 Could not obtain issue details from JIRA