Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement: Implement MersenneTwisterIntegerGenerator #769

Closed
wants to merge 1 commit into from

Conversation

localheinz
Copy link
Member

@localheinz localheinz commented Sep 19, 2023

What is the reason for this PR?

  • implements a MersenneTwisterIntegerGenerator

Follows #771.

πŸ’β€β™‚οΈ Related to #753, perhaps this makes sense to extract (and of course use, that's why this is a draft for now)

Author's checklist

Summary of changes

Review checklist

  • All checks have passed
  • Changes are added to the CHANGELOG.md
  • Changes are approved by maintainer

@localheinz localheinz added the enhancement New feature or request label Sep 19, 2023
@localheinz localheinz self-assigned this Sep 19, 2023
/**
* @experimental
*/
final class MersenneTwisterIntegerGenerator implements SeedableIntegerGenerator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming of this class is probably not ideal. Specifically this is PHP's global instance of Mt19937 (colloquially β€œMersenne Twister”). There's also https://www.php.net/manual/en/class.random-engine-mt19937.php which is the OO variant that does not rely on global state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TimWolla

Yes, it is.

For PHP 8.3, an implementation that uses Random\Randomizer could look like this:

<?php

declare(strict_types=1);

namespace Faker\Core\Generator;

use Random\Randomizer;

/**
 * @experimental
 */
final class RandomIntegerGenerator implements IntegerGenerator
{
    private Randomizer $randomizer;

    public function __construct(Randomizer $randomizer)
    {
        $this->randomizer = $randomizer;
    }

    public function integer(): int
    {
        return $this->randomizer->nextInt();

    }

    public function integerBetween(int $minimum, int $maximum): int
    {
        if ($minimum > $maximum) {
            throw new \InvalidArgumentException(sprintf(
                'Minimum value %d should not be greater than the maximum value %d.',
                $minimum,
                $maximum,
            ));
        }

        return $this->randomizer->getInt($minimum, $maximum);
    }

    public function largestInteger(): int
    {
        return PHP_INT_MAX;
    }
}

What do you think?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on the expected semantics of the various methods, the implementation is broken. ->nextInt() is not guaranteed to return values up to PHP_INT_MAX, so if you rely on that, you will be disappointed.

Generally I'd say that using ->nextInt() is an anti-pattern (similarly to calling mt_rand() without parameters; the method only exists for drop-in compatibility of the OO API). Retrieving an arbitrary positive integer is very rarely useful. You generally know what range you want to retrieve and ->getInt(0, PHP_INT_MAX) would at least be explicit about the fact that you do not care.

Almost all uses of mt_rand() without parameters are better replaced by random_bytes() (possibly combined with bin2hex()).

Comment on lines 12 to 15
/**
* Seeds the number generator.
*/
public function seed(int $value): void;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing the reseeding of an existing Generator is probably not a good idea. The native engines (PHP 8.2+) intentionally support constructor-seeding only to make their behavior as predictable as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TimWolla

Thank you for your feedback!

With this pull request I attempt to move the generation of (random) integers into a single place so we can replace the underlying strategy as needed or permitted on the system on which fakerphp/faker runs.

fakerphp/faker currently allows seeding, that's why I extracted a separate interface.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fakerphp/faker currently allows seeding, that's why I extracted a separate interface.

In this case it would probably be cleaner to completely replace the β€œgenerator” instance whenever reseeding happens. But I don't know if that's feasible with the current architecture / user expectations.

As I said over in #752, it would make sense to embrace the native OO API. As you're breaking compatibility in a new major anyway you might as well go the extra mile. YMMV.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TimWolla

As you may have noticed in #752, there is quite some resistance towards even adopting a PHP version support policy, let alone dropping support for PHP versions without active support. Dropping support here for PHP 8.1 is out of the question right now.

The aim here is to

  • extract an interface for generating (random) integers
  • extracting an implementation of that interface that reflects the current functionality
  • making room for alternative implementations of that interface

Everything that is broken in fakerphp/faker:1.23 and fakerphp/faker:2.0 right now by using

  • mt_rand() with arguments
  • mt_rand() without arguments
  • mt_getrandmax()
  • mt_srand()

is totally fine for every single user of this package - they have had time to get used to it.

Again, the aim here is to move forward in small steps.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I would also be somewhat resistant to dropping support for a PHP version if the increased minimum requirement does not provide a value-add. Adopting the new random API would be a value-add, though.

I don't have any personal or professional interest in Faker, I no longer use it (I've contributed some stuff back in 2013 and 2014). I'm just here commenting from the perspective of one of the randomness maintainers in PHP itself, hoping it would be useful to not make avoidable mistakes in the design of the new major version (e.g. the mt_rand() without parameters one).

@localheinz localheinz force-pushed the feature/generator branch 2 times, most recently from 9732f3a to 11289af Compare September 20, 2023 15:48
@localheinz localheinz marked this pull request as ready for review September 22, 2023 09:30
@localheinz localheinz removed their assignment Sep 22, 2023
@stale
Copy link

stale bot commented Oct 15, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 1 week if no further activity occurs. Thank you for your contributions.

Copy link

stale bot commented Dec 15, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 1 week if no further activity occurs. Thank you for your contributions.

@localheinz localheinz closed this Jan 2, 2024
@localheinz localheinz deleted the feature/generator branch January 2, 2024 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request lifecycle/stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants