Using buffer for writing to a file during preprocessing #260

Open
bhanu77prakash opened this issue Apr 13, 2022 · 0 comments

In the data preprocessing code, there is a function that writes a triple to a file:

def write_triple(f, ent, rel, t, S, P, O):
    """Write a triple to a file. """
    f.write(str(ent[t[S]]) + "\t" + str(rel[t[P]]) + "\t" + str(ent[t[O]]) + "\n")

I think writing this way, with one `f.write` call per triple, would take a lot of time when you deal with hundreds of millions of relations. A better method would be to maintain a buffer (e.g. a string) and dump it to the file whenever it reaches a certain threshold.
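A minimal sketch of what I mean, not tested against the actual preprocessing script. The `TripleBuffer` class and its `threshold` parameter are hypothetical names made up for illustration; only the argument layout of `write_triple` is taken from the existing function. It collects formatted lines in a list and writes them in one call once the threshold is reached.

```python
import io


class TripleBuffer:
    """Hypothetical helper: collects formatted triples and writes them in batches."""

    def __init__(self, f, threshold=100_000):
        self.f = f                  # any object with a .write(str) method
        self.threshold = threshold  # number of lines to hold before flushing (assumed value)
        self.lines = []

    def write_triple(self, ent, rel, t, S, P, O):
        """Format one triple like the existing function, but keep it in memory."""
        self.lines.append(f"{ent[t[S]]}\t{rel[t[P]]}\t{ent[t[O]]}\n")
        if len(self.lines) >= self.threshold:
            self.flush()

    def flush(self):
        """Write all buffered lines with a single call and empty the buffer."""
        if self.lines:
            self.f.write("".join(self.lines))
            self.lines.clear()


# Usage with an in-memory file and toy mappings (illustrative data only).
ent = {0: "Paris", 1: "France"}
rel = {0: "capital_of"}
S, P, O = 0, 1, 2

out = io.StringIO()
buf = TripleBuffer(out, threshold=2)
buf.write_triple(ent, rel, (0, 0, 1), S, P, O)
buf.write_triple(ent, rel, (1, 0, 0), S, P, O)  # threshold reached, buffer is flushed
buf.write_triple(ent, rel, (0, 0, 1), S, P, O)
buf.flush()  # the final flush is needed so the remaining lines are not lost
```

One caveat: file objects returned by Python's built-in `open()` are already buffered by default, so how much this helps would need to be measured on the real data. Passing a larger `buffering` value to `open()` may be a simpler alternative to try first.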
