Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigtableIO.Read: use PBegin, rather than PInput #18

Closed
wants to merge 1 commit into from

Conversation

dhalperi
Copy link
Contributor

@dhalperi dhalperi commented Mar 4, 2016

Sources should start from the beginning of a pipeline.

Sources should start from the beginning of a pipeline.
@dhalperi
Copy link
Contributor Author

dhalperi commented Mar 4, 2016

any R will do @lukecwik @kennknowles @tgroh

@dhalperi
Copy link
Contributor Author

dhalperi commented Mar 4, 2016

backwards-incompatible, but has not been released to Maven yet so it should be okay. Users should not be affected unless they're using this wildly incorrectly.

@davorbonaci
Copy link
Member

SG, but let's make sure it gets to 1.5 of Dataflow.

@lukecwik
Copy link
Member

lukecwik commented Mar 4, 2016

LGTM

@asfgit asfgit closed this in 419b6f4 Mar 4, 2016
davorbonaci added a commit to GoogleCloudPlatform/DataflowJavaSDK that referenced this pull request Mar 4, 2016
@dhalperi dhalperi deleted the bigtable-pbegin branch March 7, 2016 19:09
aljoscha pushed a commit to aljoscha/beam that referenced this pull request Mar 17, 2018
Rewrite Read as Impulse() | ParDo(Split) | ParDo(Read) in Python SDK
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
tvalentyn pushed a commit to tvalentyn/beam that referenced this pull request May 15, 2018
mareksimunek referenced this pull request in seznam/beam Jul 9, 2018
kennknowles pushed a commit that referenced this pull request Oct 16, 2018
mxm pushed a commit to mxm/beam that referenced this pull request Sep 16, 2019
* PRICING-11953: Enable cython36 build

* updated venv path

* updated the makefile command

* update docker and makefile

* udpate version
pabloem pushed a commit to pabloem/beam that referenced this pull request Feb 13, 2021
* New DebeziumIO class.

* Merge connector code

* DebeziumIO and MySqlConnector integrated.

* Added FormatFuntion param to Read builder on DebeziumIO.

* Added arguments checker to DebeziumIO.

* Add simple JSON mapper object (#1)

* Add simple JSON mapper object

* Fixed Mapper.

* Add SqlServer connector test

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Fixing MySQL schema DataException

Using file instead of schema should fix it

* MySQL Connector updated from 1.3.0 to 1.3.1

Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>
Co-authored-by: Carlos Dominguez <carlos.dominguez@carlos.dominguez>
Co-authored-by: Carlos Domínguez <carlos.dominguez@wizeline.com>

* Add debeziumio tests

* Debeziumio testing json mapper (#3)

* Some code refactors. Use a default DBHistory if not provided

* Add basic tests for Json mapper

* Debeziumio time restriction (apache#5)

* Add simple JSON mapper object

* Fixed Mapper.

* Add SqlServer connector test

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Fixing MySQL schema DataException

Using file instead of schema should fix it

* MySQL Connector updated from 1.3.0 to 1.3.1

* Some code refactors. Use a default DBHistory if not provided

* Adding based-time restriction

Stop polling after specified amount of time

* Add basic tests for Json mapper

* Adding new restriction

Uses a time-based restriction

* Adding optional restrcition

Uses an optional time-based restriction

Co-authored-by: juanitodread <juanitodread@gmail.com>
Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>

* Upgrade DebeziumIO connector (apache#4)

* Address comments (Change dependencies to testCompile, Set JsonMapper/Coder as default, refactors) (apache#8)

* Revert file

* Change dependencies to testCompile
* Move Counter sample to unit test

* Set JsonMapper as default mapper function
* Set String Coder as default coder when using JsonMapper
* Change logs from info to debug

* Debeziumio javadoc (apache#9)

* Adding javadoc

* Added some titles and examples

* Added SourceRecordJson doc

* Added Basic Connector doc

* Added KafkaSourceConsumer doc

* Javadoc cleanup

* Removing BasicConnector

No usages of this class were found overall

* Editing documentation

* Debeziumio fetched records restriction (apache#10)

* Adding javadoc

* Adding restriction by number of fetched records

Also adding a quick-fix for null value within SourceRecords
Minor fix on both MySQL and PostgreSQL Connectors Tests

* Run either by time or by number of records

* Added DebeziumOffsetTrackerTest

Tests both restrictions: By amount of time and by Number of records

* Removing comment

* DebeziumIO test for DB2. (apache#11)

* DebeziumIO test for DB2.

* DebeziumIO javadoc.

* Clean code:removed commented code lines on DebeziumIOConnectorTest.java

* Clean code:removing unused imports and using readAsJson().

Co-authored-by: Carlos Domínguez <74681048+carlosdominguezwl@users.noreply.github.com>

* Debezium limit records (now configurable) (apache#12)

* Adding javadoc

* Records Limit is now configurable

(It was fixed before)

* Debeziumio dockerize (apache#13)

* Add mysql docker container to tests

* Move debezium mysql integration test to its own file

* Add assertion to verify that the results contains a record.

* Debeziumio readme (apache#15)

* Adding javadoc

* Adding README file

* Add number of records configuration to the DebeziumIO component (apache#16)

* Code refactors (apache#17)

* Remove/ignore null warnings

* Remove DB2 code

* Remove docker dependency in DebeziumIO unit test and max number of recods to MySql integration test

* Change access modifiers accordingly

* Remove incomplete integration tests (Postgres and SqlServer)

* Add experimenal tag

* Debezium testing stoppable consumer (apache#18)

* Add try-catch-finally, stop SourceTask at finally.

* Fix warnings

* stopConsumer and processedRecords local variables removed. UT for task stop use case added

* Fix minor code style issue

Co-authored-by: juanitodread <juanitodread@gmail.com>

* Fix style issues (check, spotlessApply) (apache#19)

Co-authored-by: Osvaldo Salinas <osvaldo.salinas@osvaldo.salinas>
Co-authored-by: alejandro.maguey <alejandro.maguey@wizeline.com>
Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>
Co-authored-by: Carlos Dominguez <carlos.dominguez@carlos.dominguez>
Co-authored-by: Carlos Domínguez <carlos.dominguez@wizeline.com>
Co-authored-by: Carlos Domínguez <74681048+carlosdominguezwl@users.noreply.github.com>
Co-authored-by: Alejandro Maguey <alexmaguey1@gmail.com>
Co-authored-by: Hassan Reyes <hassanreyes@users.noreply.github.com>
pabloem pushed a commit that referenced this pull request Feb 17, 2021
Debeziumio PoC (#7)

* New DebeziumIO class.

* Merge connector code

* DebeziumIO and MySqlConnector integrated.

* Added FormatFuntion param to Read builder on DebeziumIO.

* Added arguments checker to DebeziumIO.

* Add simple JSON mapper object (#1)

* Add simple JSON mapper object

* Fixed Mapper.

* Add SqlServer connector test

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Fixing MySQL schema DataException

Using file instead of schema should fix it

* MySQL Connector updated from 1.3.0 to 1.3.1

Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>
Co-authored-by: Carlos Dominguez <carlos.dominguez@carlos.dominguez>
Co-authored-by: Carlos Domínguez <carlos.dominguez@wizeline.com>

* Add debeziumio tests

* Debeziumio testing json mapper (#3)

* Some code refactors. Use a default DBHistory if not provided

* Add basic tests for Json mapper

* Debeziumio time restriction (#5)

* Add simple JSON mapper object

* Fixed Mapper.

* Add SqlServer connector test

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Fixing MySQL schema DataException

Using file instead of schema should fix it

* MySQL Connector updated from 1.3.0 to 1.3.1

* Some code refactors. Use a default DBHistory if not provided

* Adding based-time restriction

Stop polling after specified amount of time

* Add basic tests for Json mapper

* Adding new restriction

Uses a time-based restriction

* Adding optional restrcition

Uses an optional time-based restriction

Co-authored-by: juanitodread <juanitodread@gmail.com>
Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>

* Upgrade DebeziumIO connector (#4)

* Address comments (Change dependencies to testCompile, Set JsonMapper/Coder as default, refactors) (#8)

* Revert file

* Change dependencies to testCompile
* Move Counter sample to unit test

* Set JsonMapper as default mapper function
* Set String Coder as default coder when using JsonMapper
* Change logs from info to debug

* Debeziumio javadoc (#9)

* Adding javadoc

* Added some titles and examples

* Added SourceRecordJson doc

* Added Basic Connector doc

* Added KafkaSourceConsumer doc

* Javadoc cleanup

* Removing BasicConnector

No usages of this class were found overall

* Editing documentation

* Debeziumio fetched records restriction (#10)

* Adding javadoc

* Adding restriction by number of fetched records

Also adding a quick-fix for null value within SourceRecords
Minor fix on both MySQL and PostgreSQL Connectors Tests

* Run either by time or by number of records

* Added DebeziumOffsetTrackerTest

Tests both restrictions: By amount of time and by Number of records

* Removing comment

* DebeziumIO test for DB2. (#11)

* DebeziumIO test for DB2.

* DebeziumIO javadoc.

* Clean code:removed commented code lines on DebeziumIOConnectorTest.java

* Clean code:removing unused imports and using readAsJson().

Co-authored-by: Carlos Domínguez <74681048+carlosdominguezwl@users.noreply.github.com>

* Debezium limit records (now configurable) (#12)

* Adding javadoc

* Records Limit is now configurable

(It was fixed before)

* Debeziumio dockerize (#13)

* Add mysql docker container to tests

* Move debezium mysql integration test to its own file

* Add assertion to verify that the results contains a record.

* Debeziumio readme (#15)

* Adding javadoc

* Adding README file

* Add number of records configuration to the DebeziumIO component (#16)

* Code refactors (#17)

* Remove/ignore null warnings

* Remove DB2 code

* Remove docker dependency in DebeziumIO unit test and max number of recods to MySql integration test

* Change access modifiers accordingly

* Remove incomplete integration tests (Postgres and SqlServer)

* Add experimenal tag

* Debezium testing stoppable consumer (#18)

* Add try-catch-finally, stop SourceTask at finally.

* Fix warnings

* stopConsumer and processedRecords local variables removed. UT for task stop use case added

* Fix minor code style issue

Co-authored-by: juanitodread <juanitodread@gmail.com>

* Fix style issues (check, spotlessApply) (#19)

Co-authored-by: Osvaldo Salinas <osvaldo.salinas@osvaldo.salinas>
Co-authored-by: alejandro.maguey <alejandro.maguey@wizeline.com>
Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>
Co-authored-by: Carlos Dominguez <carlos.dominguez@carlos.dominguez>
Co-authored-by: Carlos Domínguez <carlos.dominguez@wizeline.com>
Co-authored-by: Carlos Domínguez <74681048+carlosdominguezwl@users.noreply.github.com>
Co-authored-by: Alejandro Maguey <alexmaguey1@gmail.com>
Co-authored-by: Hassan Reyes <hassanreyes@users.noreply.github.com>

Add missing apache license to README.md

Enabling integration test for DebeziumIO (#20)

Rename connector package cdc=>debezium. Update doc references (#21)

Fix code style on DebeziumIOMySqlConnectorIT
hengfengli referenced this pull request in hengfengli/beam Mar 21, 2022
* refactor: centralises initial partition checks

Creates a InitialPartition class to do any verifications related to the
initial partition.

* fix: allow nulls in mods for data records

Old values and new values can be null in a data record.

* fix: allow nulls as values in mods

Old values and new values in mods can contain null values that need to
be properly serialized with Avro. We added a schema definition that
allows for such case here.

* fix: fix partition tracker trySplit

* feat: adds stop mode for partition position

* feat: remove initial partition logic from tracker

Removes the special casing for the initial partition in the restriction
tracker. We only allow splits when there is an claim for a timestamp
that is at least 1 second in the future.

* feat: add timestamp converter class

* feat: extract partition restriction splitter class

This will be responsible for encapsulating the logic to trying to split
a certain restriction.

* chore: add reminder to add tests

* feat: extract restriction claimer class

* test: add test for timestamp converter class

* test: add test to restriction claimer

We also start checking for preconditions here.

* test: add test for restriction splitter

* feat: add split checker for restriction

* test: add test for restriction tracker

* fix: fix partition restriction constructors

Fixes the wait for child partitions restriction constructor to send the
children to wait for and adds tests.

* feat: add logging to the restriction split checker

* feat: add logging to the restriction splitter

* fix: fix restriction claim logic

Fixes the restriction claimer to consider the last claimed position in
addition to the restriction. Fixes the restriction claimer to always
return false when trying to claim from a STOP mode.

* fix: increase default heartbeat interval

The default of 1 second was returning an error.

* fix: allow claiming end of interval

* test: fix spanner change stream it test

* feat: changes the log level of change streams

To info level for now

* chore: formats the code
kileys pushed a commit that referenced this pull request Mar 23, 2022
* Upgrade: removing deprecated initContainer tags and adding selector to deployments in elasticsearch k8s files.

Co-authored-by: Elias Segundo <elias.segundo@luisrazo.local>
sjvanrossum pushed a commit to sjvanrossum/beam that referenced this pull request May 17, 2023
Revert "Merge pull request apache#16 from dahlbaek/forbid-unsafe"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants