support streaming #14

Open
jeremyross opened this issue Apr 10, 2023 · 7 comments

Comments

@jeremyross

I'm having a little trouble determining what syntax to use for transforming CSV files of undetermined length. I'm trying this:

{
   firstName: payload.FIRST_NAME
}

But I get the error:

sjsonnet.Error: attempted to index a array with string FIRST_NAME

It looks like xtrasonnet is loading all rows into memory and then expecting the transformation to specify a row index. What I'd like to do is stream through the input CSV file and have it apply the above transformation to each row. Is this possible?

@jam01
Owner

jam01 commented Apr 11, 2023

So the result of reading a CSV input is an array of objects (unless you specified header=absent). If you want to transform each one then you have to do a map operation.

xtr.map(payload, function(obj) {
   firstName: obj.FIRST_NAME
})

Unfortunately there's no way to stream and transform input at the moment. It's somewhat possible given the sjsonnet implementation, but I haven't dug far enough into implementing it.
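
For anyone landing here with a file too large to hold in memory, the per-row behavior requested above can be approximated outside xtrasonnet. The following is only a minimal sketch in plain Java and does not use any xtrasonnet or sjsonnet API. The class name `CsvRowStreamer`, the `transform` method, and the naive comma splitting (no quoted fields or embedded commas) are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of xtrasonnet: transforms a CSV one row at a time.
public class CsvRowStreamer {

    // Reads a single row per iteration, so memory use is bounded by the row size
    // rather than the file size. Returns the number of data rows written.
    // Assumption: fields contain no quotes or embedded commas.
    public static long transform(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String headerLine = reader.readLine();
        if (headerLine == null) {
            return 0; // empty input, nothing to transform
        }
        String[] headers = headerLine.split(",", -1);
        out.write("firstName\n"); // output header

        long rows = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",", -1);
            // Build the per-row object, keyed by header name.
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < headers.length && i < cells.length; i++) {
                row.put(headers[i], cells[i]);
            }
            // Per-row equivalent of { firstName: row.FIRST_NAME }.
            out.write(row.getOrDefault("FIRST_NAME", "") + "\n");
            rows++;
        }
        out.flush();
        return rows;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        long n = transform(
                new StringReader("FIRST_NAME,LAST_NAME\nAda,Lovelace\nAlan,Turing\n"), out);
        System.out.print(out);
        System.out.println(n + " rows");
    }
}
```

A real CSV parser would be needed for quoted fields; the point here is only that each row is read, transformed, and written before the next one is loaded.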

@jeremyross
Author

Ah, ok so it sounds like this would not be suitable for transforming a large stream of data since the entire stream/file is loaded into memory. Is that correct?

@jam01
Owner

jam01 commented Apr 11, 2023

Correct.

@jeremyross
Author

Is there any interest in the streaming use case? It would open up lots of use cases for integration.

@jam01
Owner

jam01 commented Apr 11, 2023

Definitely. It's something I plan to tackle at some point, but I can't offer any ETA at the moment, mostly for lack of free time, but also because I haven't thought through the architecture of a solution.

@jeremyross
Author

I still haven't been able to get a basic map working. e.g.:

// body is List<Map<String, Object>>
.transform(xtrasonnet("""
   {
        Name: payload.Name                    
   }
""", String.class)
.bodyMediaType(MediaTypes.APPLICATION_JAVA)
.outputMediaType(MediaTypes.TEXT_CSV))

results in

sjsonnet.Error: attempted to index a array with string Name
	at [Select Name].((main):2:22) ~[na:na]

I've also tried:

[
   {
        Name: payload.Name                    
   }
]

But I get the same result. What am I doing wrong?

@jam01
Owner

jam01 commented May 31, 2023

Your payload is an array, so you have to do something like

xtr.map(payload, function(it) {
  Name: it.Name
})
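
To see why indexing the payload directly fails, it may help to look at the same operation in plain Java, where the body is a `List<Map<String, Object>>`: a list has no `Name` key, but each element does. This is only an illustrative analogue of the `xtr.map` call above, not xtrasonnet code. The class `MapEachRow` and method `selectName` are hypothetical names, and the sketch assumes every row has a non-null `Name`.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration, not part of xtrasonnet.
public class MapEachRow {

    // Plain-Java analogue of xtr.map(payload, function(it) { Name: it.Name }).
    // Assumption: every row contains a non-null "Name" (Map.of rejects nulls).
    public static List<Map<String, Object>> selectName(List<Map<String, Object>> payload) {
        return payload.stream()
                .map(it -> Map.<String, Object>of("Name", it.get("Name")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> first = Map.of("Name", "Ada", "Age", 36);
        Map<String, Object> second = Map.of("Name", "Alan", "Age", 41);
        // The key lookup happens on each element, never on the list itself.
        System.out.println(selectName(List.of(first, second)));
    }
}
```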

@jam01 changed the title from "Transforming large CSV files" to "support streaming" on Jul 20, 2023