Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add file splitting to evm:db tasks #17652

Closed

Conversation

NickLaMuro
Copy link
Member

@NickLaMuro NickLaMuro commented Jun 28, 2018

Built off of #17549
Update: Now also built off of #17798

Provides support for evm:db:dump:* tasks (will support evm:db:backup as well) to have the files split into chunks automatically by passing in the --byte-count flag to those tasks.

TODO

  • Vagrantfile and test script for testings these live (in progress)
  • Setup :remote tasks to use splitting with NFS and SMB end points
    • (want to have test vagrant env in place for this, see above)
    • This will be easy to implement, however, and it will just be adding the additional_opts code for the :remote tasks as well (see last commit)

Links

Steps for Testing/QA

I currently am using the above prerequisites locally to test (probably a gist with a Vagrantfile plus test script). Will provide detailed instructions soon.

UPDATE: Vagrant environment for testing this code, with a README on how to use it, can be found here:

https://gist.github.com/NickLaMuro/8438015363d90a2d2456be650a2b9bc6

@miq-bot miq-bot added the wip label Jun 28, 2018
@@ -0,0 +1,190 @@
#!/usr/bin/env ruby
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather reuse the manageiq/tools directory than introduce a new util dir.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This util dir already existed, pretty sure...

https://github.com/ManageIQ/manageiq/tree/d9dd44b/lib/manageiq/util

But also, this is from #17549 so I would prefer comments on file_splitter.rb would be done there. This is just making use of it because I need to keep moving forward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, nvm... Thanks misread this as /util. This is fine.

@@ -0,0 +1,71 @@
require 'net/ftp'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just live in /lib, IMO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not opposed to it, just was putting it in close proximity to the script that was using it, and also the namespace doesn't hurt avoiding potential conflicts (FtpLib is not my most creative name...).

end
end
instance_exec &block if block_given?
end.tap { |opts| opts.delete_if { |k,v| v.nil? } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If more_core_extensions is available, this can be simplified to opts.delete_nils

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, okay, I will look into that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

opt :uri_username, "Destination depot username", :type => :string
opt :uri_password, "Destination depot password", :type => :string
when :exclude_table_data
opt :"exclude-table-data", "Tables to exclude data", :type => :strings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this hyphenization. Trollop already auto-hypenates things.

Copy link
Member Author

@NickLaMuro NickLaMuro Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trollop already auto-hypenates things.

Oh, I guess I didn't know that or rather didn't pay enough attention when I wrote this originally.

That said, I might have done this on purpose because PostgresAdmin passes this option down to pg_dump via AwesomeSpawn, and that does not auto-hypenate, if I am not mistaken...

... Nick goes to check...

Copy link
Member Author

@NickLaMuro NickLaMuro Jun 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, looks like AwesomeSpawn does convert from underscore to dash, so I could have avoided this in the first place. Not sure what I was thinking originally, but that was a month ago...

Not sure if I am wanting to change this in both PostgresAdmin and here at the moment. Might make a followup PR for addressing this specifically.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things of note about this (since it is a good spot where I can leave this comment, reference it, and no one will get on my case about it being a 📖...)

  • So Trollop does auto-hyphenate as mentioned, but it does not do that when you pass the flag option directly.
  • The above is an important distinction because we are auto generating our flags, even our short ones. Those short_opts, because of the EvmDba.with_options changes might change as we add/remove new opts over time.
  • Because of the short opt generation, we should probably consider one of following:
    • In internal tools, never use short opts when calling out to these tasks from elsewhere (i.e. appliance_console)
    • Hard coding our short opts
    • Disabling our short opts

Also, because I am a "troll", this is why I just use optparse. :trollface:

opt :password, "Password", :type => :string
opt :hostname, "Hostname", :type => :string
opt :dbname, "Database name", :type => :string
opt :"exclude-table-data", "Tables to exclude data", :type => :strings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nvm...do you know why it's like this to begin with?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left my rational in #17652 (comment) , but will double check on that AwesomeSpawn bit and get back to you on this.

@NickLaMuro NickLaMuro force-pushed the file_split_in_evm_database_ops branch from f0605b8 to 988b277 Compare June 28, 2018 23:09
split_opts = [
[:byte_count, additional_opts[:byte_count]],
[nil, "-"],
[nil, db_opts[:local_file]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified to

      split_opts = [
        additional_opts.slice(:byte_count),
        "-",
        db_opts[:local_file]
      ]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I was under the impression that it had to be a fully associative array. I will look into doing this.

def initialize(options = {})
@input_file = options[:input_file] || ARGF
@input_filename = options[:input_filename]
@byte_count = options[:byte_count] || 10_485_760
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer 10.megabytes for readability (or if not ActiveSupport, 10 * 1024 * 1024 # 10.megabytes)


bytes = match[:BYTE_NUM].to_i
if match[:BYTE_QUALIFIER]
bytes *= { "k" => 1024, "m" => 1_048_576 }[match[:BYTE_QUALIFIER].downcase]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a method in ActiveSupport that does this for you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see #17549

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For posterity's sake, I don't think it does.

@NickLaMuro NickLaMuro force-pushed the file_split_in_evm_database_ops branch from 988b277 to 56b98fe Compare July 16, 2018 00:59
@NickLaMuro NickLaMuro force-pushed the file_split_in_evm_database_ops branch from 56b98fe to 83c2d8c Compare July 18, 2018 01:58
@NickLaMuro NickLaMuro changed the title [WIP] Add file splitting to evm:db tasks Add file splitting to evm:db tasks Jul 18, 2018
@NickLaMuro
Copy link
Member Author

Need to fix tests, but have updated the QA steps in the PR description with a set of instructions on how to test all of the evm:db:backup:* and evm:db:dump:* tasks.

Probably have a few more TODO's with that script to call it good, but that should be a good start to prove that what I have here is working.

@miq-bot
Copy link
Member

miq-bot commented Aug 1, 2018

This pull request is not mergeable. Please rebase and repush.

In EvmDatabaseOps, this change makes it so only the S3 mount will
attempt an upload/download on the necessary actions, and the rest will
remain as they were.
Splits files based on a given size, similar to `split`.  Uses number
based filename increments instead of letter based (which is the default
in `split`)
A library for shared Net::FTP functions that can be used throughout the
ManageIQ project.  Extracted from FileDepotFtp.
Allows for sending both anonymous and authorized FTP uploads directly
from the script.  Splits are done in memory, so the file never is
created on the host machine, which is ideal for large backups and such
that are being split up.

Spec for testing this functionality are done via a live FTP (vsftpd)
server setup with the following config:

    listen=NO
    listen_ipv6=YES

    local_enable=YES
    local_umask=022
    write_enable=YES
    connect_from_port_20=YES

    anonymous_enable=YES
    anon_root=/var/ftp/pub
    anon_umask=022
    anon_upload_enable=YES
    anon_mkdir_write_enable=YES
    anon_other_write_enable=YES

    pam_service_name=vsftpd
    userlist_enable=YES
    userlist_deny=NO
    tcp_wrappers=YES

Running on the 192.168.50.3 IP address.  Full vagrant setup for the
above can be found here:

    https://gist.github.com/NickLaMuro/c883a997c7ae943dd684bccd469cea43

Put the `Vagrantfile` into a directory and run:

    $ vagrant up

(with `vagrant` installed on that machine, of course)

The specs will only run if the user specifically targets the tests using
a tag flag:

    $ bundle exec rspec --tag with_real_ftp
The method `.backup` being called to PostgresAdmin was incorrect, and
should have been `.backup_pg_dump` from the beginning.
There were a lot of repeated options being defined for many of the tasks
in evm_dba.rake, so this DRYs up those definitions and defines them in
one spot.

The following tasks have been updated as a result:

  - `evm:db:gc`
  - `evm:db:region`
  - `evm:db:backup:local`
  - `evm:db:backup:remote`
  - `evm:db:dump:local`
  - `evm:db:dump:remote`
  - `evm:db:restore:local`
  - `evm:db:restore:remote`

If you run the following now:

    $ bin/rake [TASK] -- --help

With each of the above tasks, before and after this commit, the help
documentation should remain consitent throughout.
Adds the ability to pass the `--byte-count` (`-b`) to the following rake
tasks:

    - evm:db:backup:local
    - evm:db:backup:remote
    - evm:db:dump:local
    - evm:db:dump:remote

Adds additional options to the `EvmDba.with_options` method, and adds
the `EvmDba.collect_additional_opts` method, which pulls out the split
params from the options hash into it's own hash.  Used by EvmDatabaseOps
to determine if piping through `file_splitter.rb` is necessary.
@NickLaMuro
Copy link
Member Author

I should have removed any of the merge conflicts, so not sure why the bot hasn't removed the label.

Updated to include a bugfix that I need for splits to work: #17798

@miq-bot
Copy link
Member

miq-bot commented Aug 3, 2018

Checked commits NickLaMuro/manageiq@e8bf151~...fa245f0 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
7 files checked, 7 offenses detected

lib/evm_database_ops.rb

lib/manageiq/util/file_splitter.rb

lib/tasks/evm_dba.rake

@NickLaMuro
Copy link
Member Author

Test failures are from #17798 since I was not really designing those changes with specs in mind. Once I have the fixes in place for there, I will rebase them into this branch.

@miq-bot
Copy link
Member

miq-bot commented Aug 10, 2018

This pull request is not mergeable. Please rebase and repush.

@NickLaMuro NickLaMuro changed the title Add file splitting to evm:db tasks [WIP] Add file splitting to evm:db tasks Aug 14, 2018
@NickLaMuro
Copy link
Member Author

While I have closed most of the PRs surrounding this one with the following rational:

#17798 (comment)

I am going to leave this one open for the moment as I will probably open a similar one to integrate my new changes in. Most likely I won't try to reuse this one as I don't think it is worth the effort at this point, but until the replacement is put together (:soon:), I am going to leave this open for the time being to make sure I covered everything I did in this one.

@NickLaMuro
Copy link
Member Author

Just realized I still have this open. V2 located here:

#17894

Closing...

@NickLaMuro NickLaMuro closed this Aug 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants