Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing tests when using system glpk #29493

Closed
kliem opened this issue Apr 10, 2020 · 67 comments
Closed

Failing tests when using system glpk #29493

kliem opened this issue Apr 10, 2020 · 67 comments

Comments

@kliem
Copy link
Contributor

kliem commented Apr 10, 2020

At the moment there are two failing doctests, when using the glpk from the system, e.g. on ubuntu eoan https://github.com/mkoeppe/sage/runs/542655821

sage -t src/sage/numerical/backends/glpk_backend.pyx
**********************************************************************
File "src/sage/numerical/backends/glpk_backend.pyx", line 2287, in sage.numerical.backends.glpk_backend.GLPKBackend.print_ranges
Failed example:
    p.print_ranges()
Expected:
    glp_print_ranges: optimal basic solution required
    1
Got:
    1

This doctest was mentioned before in #29317 with a suggestion for a fix.

 sage -t src/sage/libs/glpk/error.pyx
**********************************************************************
File "src/sage/libs/glpk/error.pyx", line 100, in sage.libs.glpk.error.setup_glpk_error_handler
Failed example:
    res = p.solve()
Expected:
          0: obj = ...
Got:
    <BLANKLINE>

The problem is that we have doctests that rely on error-recovery behavior added by a custom patch [#20710 comment:18],
which wasn't accepted by upstream. (The doctest for the patch was added in #20832.)

The present ticket fixes the failures by

  • adding explicit input validation in several functions in the Cython wrappers so that the GLPK error handler is not reached;
  • changing the default optimization mode from glp_intopt to glp_simplex_then_intopt, which is more robust;
  • disabling the test for the error-recovery behavior, which provokes a crash with unpatched GLPK.

CC: @jdemeyer @mkoeppe @orlitzky @jpflori @embray @sagetrac-gouezel @kiwifb @dcoudert @dimpase

Component: packages: standard

Keywords: glpk, patches

Author: Michael Orlitzky, Matthias Koeppe

Branch/Commit: 8acdf34

Reviewer: Matthias Koeppe, Michael Orlitzky

Issue created by migration from https://trac.sagemath.org/ticket/29493

@kliem kliem added this to the sage-9.1 milestone Apr 10, 2020
@kliem
Copy link
Contributor Author

kliem commented Apr 10, 2020

comment:1

What we could do is add a new testing flag style # optional - sage-glpk (and for any other package).
This would than only be tested when sage build it's own version of the package.

This could also be used for #29092 and make the tests nicer for #29417.

@kliem
Copy link
Contributor Author

kliem commented Apr 14, 2020

comment:2

We should/will have a general discussion about things like this on sage-devel after 9.1 is released.

@kliem kliem modified the milestones: sage-9.1, sage-9.2 Apr 14, 2020
@jhpalmieri
Copy link
Member

comment:3

How about a patch like this? Simpler than adding a new tag.

diff --git a/src/sage/doctest/parsing.py b/src/sage/doctest/parsing.py
index ebf7555106..503ad17e6f 100644
--- a/src/sage/doctest/parsing.py
+++ b/src/sage/doctest/parsing.py
@@ -44,6 +44,7 @@ optional_regex = re.compile(r'(py2|py3|long time|not implemented|not tested|know
 # which has not been patched, we need to ignore that message.
 # See :trac:`29317`.
 glpk_simplex_warning_regex = re.compile(r'(Long-step dual simplex will be used)')
+glpk_print_ranges_regex = re.compile(r'(glp_print_ranges: optimal basic solution required)')
 find_sage_prompt = re.compile(r"^(\s*)sage: ", re.M)
 find_sage_continuation = re.compile(r"^(\s*)\.\.\.\.:", re.M)
 find_python_continuation = re.compile(r"^(\s*)\.\.\.([^\.])", re.M)
@@ -1077,6 +1078,7 @@ class SageOutputChecker(doctest.OutputChecker):
         """
         got = self.human_readable_escape_sequences(got)
         got = glpk_simplex_warning_regex.sub('', got)
+        got = glpk_print_ranges_regex.sub('', got)
         if isinstance(want, MarkedOutput):
             if want.random:
                 return True
diff --git a/src/sage/libs/glpk/error.pyx b/src/sage/libs/glpk/error.pyx
index f7046b3ec4..26faf1e8d9 100644
--- a/src/sage/libs/glpk/error.pyx
+++ b/src/sage/libs/glpk/error.pyx
@@ -98,7 +98,7 @@ def setup_glpk_error_handler():
         sage: p.add_constraint(x >= 0)
         sage: p.set_objective(x + y)
         sage: res = p.solve()
-              0: obj = ...
+        ...
         sage: res  # rel tol 1e-15
         2.4
     """
diff --git a/src/sage/numerical/backends/glpk_backend.pyx b/src/sage/numerical/backends/glpk_backend.pyx
index b1acf12a28..4b2d12eddc 100644
--- a/src/sage/numerical/backends/glpk_backend.pyx
+++ b/src/sage/numerical/backends/glpk_backend.pyx
@@ -2285,7 +2285,6 @@ cdef class GLPKBackend(GenericBackend):
             sage: import sage.numerical.backends.glpk_backend as backend
             sage: p.solver_parameter(backend.glp_simplex_or_intopt, backend.glp_simplex_only)
             sage: p.print_ranges()
-            glp_print_ranges: optimal basic solution required
             1
             sage: p.solve()
             0

@kliem
Copy link
Contributor Author

kliem commented Apr 29, 2020

comment:4

It's certainly simpler than adding the tag. However there are two problems:

  • In upgrade glpk to 4.60 #20710 people agreed that this patch is necessary. Is it ok to ignore it, if the system package is present? Do we warn the user that this package has some API problems with sage? (This is a more general issue, as there should be some agreement on how much failure do we accept when a system package is used.)
  • Isn't your approach the same as removing those tests. That would lead to us never noticing when upgrade glpk to 4.60 #20710 breaks (unless the patch causes build failure of course).

The idea of the tag is that there might be tests that only pass when the sage-build package is used, e.g. #29092. With the tag we would properly document that this works with sage install of the package, but it might not otherwise. But there are probably better ideas out there.

@jhpalmieri
Copy link
Member

comment:5

To me there are two issues:

  • what are these doctests trying to test? For the one in glpk_backend.pyx, is the extra message useful or can it be ignored? I don't know the answers, by the way.
  • re whether the patch is necessary: that should have been a discussion in the ticket which set up the spkg-configure.m4 file for glpk, which allows users to use the system version instead of Sage's. Should build/pkgs/glpk/spkg-configure.m4 somehow test this feature of Sage's glpk and not allow others?

Adding a tag to the doctest so it is only run when Sage's glpk is used also ignores whether the patch is necessary, so it doesn't really solve the problem, except to test whether Sage's glpk built correctly. Since the momentum is toward using more system packages, people who know about glpk (i.e., not me) need to decide whether to continue allowing use of unpatched system versions. If so, then in my opinion we should test Sage's modifications to glpk in its own test suite (spkg-check) and fix the doctests so they work with all versions.

@kliem
Copy link
Contributor Author

kliem commented Apr 29, 2020

comment:6

I agree that moving the second test makes more sense than an extra flag. (Unless of course people decide that this is unacceptable.)

The first tests should stay there though, as I don't think it is there to test the warning message. So your fix would be just fine.

@orlitzky
Copy link
Contributor

comment:7

What problem was this patch trying to solve? The GLPK documentation states,

If some GLPK API routine detects erroneous or incorrect data passed by the application program,
it writes appropriate diagnostic messages to the terminal and then abnormally terminates
the application program. In most practical cases this allows to simplify programming by avoiding
numerous checks of return codes. Thus, in order to prevent crashing the application program should
check all data, which are suspected to be incorrect, before calling GLPK API routines.

Should note that this kind of error handling is used only in cases of incorrect data passed by
the application program. If, for example, the application program calls some GLPK API routine
to read data from an input file and these data are incorrect, the GLPK API routine reports about
error in the usual way by means of the return code.

Given that the doctest in sage/libs/glpk/error.pyx has to go to great lengths to crash glpk...

sage: cython('''
....: # distutils: libraries = glpk z gmp
....: from cysignals.signals cimport sig_on, sig_off
....: from sage.libs.glpk.env cimport glp_term_out
....:
....: sig_on()
....: glp_term_out(12345)  # invalid value
....: sig_off()
....: ''')
Traceback (most recent call last):

what sort of real problems are we expecting this to catch during normal sage usage? Is there any place that a user passes data directly to glpk that we can't inspect/preprocess first, to ensure that it's correct? If not, maybe we should just crash sage when glpk crashes and drop the patch, because any crash of glpk is a low-level sage programming error.

@orlitzky
Copy link
Contributor

comment:8

For example, one of the GLPK crashes that we catch and throw is,

sage: from sage.numerical.backends.generic_backend import get_solver
sage: p = get_solver(solver="GLPK")
sage: p.variable_upper_bound(2)
...
GLPKError: glp_get_col_ub: j = 3; column number out of range

The problem? We assume that the index 2 was valid, and use GLPK crashing as an indication that it isn't. Instead, we should just check that the index is valid (i.e. use the API correctly):

sage: from sage.numerical.backends.generic_backend import get_solver
sage: p = get_solver(solver="GLPK")
sage: p.ncols()
0
sage: p.add_variable()
0
sage: p.add_variable()
1
sage: p.add_variable()
2
sage: p.ncols()
3
sage: p.variable_upper_bound(2)
<no crash>

Tada, one down.

@kliem
Copy link
Contributor Author

kliem commented May 30, 2020

comment:9

Maybe someone involved #20710 can help us figure out, if we can drop this patch.

@orlitzky
Copy link
Contributor

comment:10

Running git grep GLPKError shows all of the instances that we need to worry about. And all except two look like missing bounds checks that can be turned into aValueError.

That leaves:

.. WARNING::

    Sage uses GLPK's ``glp_intopt`` to find solutions.
    This routine sometimes FAILS CATASTROPHICALLY
    when given a system it cannot solve. (:trac:`12309`.)
    Here, "catastrophic" can mean either "infinite loop" or
    segmentation fault. Upstream considers this behavior
    "essentially innate" to their design, and suggests
    preprocessing it with ``glp_simplex`` first.
    Thus, if you suspect that your system is infeasible,
    set the ``preprocessing`` option first.

EXAMPLES::

    sage: lp = MixedIntegerLinearProgram(solver = "GLPK")
    sage: v = lp.new_variable(nonnegative=True)
    sage: lp.add_constraint(v[1] +v[2] -2.0 *v[3], max=-1.0)
    sage: lp.add_constraint(v[0] -4.0/3 *v[1] +1.0/3 *v[2], max=-1.0/3)
    sage: lp.add_constraint(v[0] +0.5 *v[1] -0.5 *v[2] +0.25 *v[3], max=-0.25)
    sage: lp.solve()
    0.0
    sage: lp.add_constraint(v[0] +4.0 *v[1] -v[2] +v[3], max=-1.0)
    sage: lp.solve()
    Traceback (most recent call last):
    ...
    GLPKError: Assertion failed: ...
    sage: lp.solver_parameter("simplex_or_intopt", "simplex_then_intopt")
    sage: lp.solve()
    Traceback (most recent call last):
    ...
    MIPSolverException: GLPK: Problem has no feasible solution

I've included the WARNING block from just prior to the doctest, because it tells us how to fix this problem. Instead of an unsafe default, we should enable preprocessing but let the user disable it if he thinks he knows what he's doing (and is willing to crash sage if he's wrong).

And the other instance:

cpdef eval_tab_row(self, int k):
    r"""
    Computes a row of the current simplex tableau.
    ...
    try:
        sig_on()            # to catch GLPKError
        i = glp_eval_tab_row(self.lp, k + 1, c_indices, c_values)
        sig_off()
    except GLPKError:
        raise MIPSolverException('GLPK: basis factorization does not exist; or variable must be basic')

I haven't computed a tableau in about a decade. Is that something we can check before we call glp_eval_tab_row()?

@orlitzky
Copy link
Contributor

comment:11

Replying to @orlitzky:

I haven't computed a tableau in about a decade. Is that something we can check before we call glp_eval_tab_row()?

Yup. I'm in the process of fixing all of these. From the GLPK docs:

Synopsis:
int glp_bf_exists(glp_prob *P);
Comments:
... So before calling any API routine, which uses the basis factorization, the application program must make sure (using the routine glp_bf_exists) that the factorization exists and therefore available for computations.

@orlitzky
Copy link
Contributor

orlitzky commented Jun 1, 2020

Branch: u/mjo/ticket/29493

@orlitzky
Copy link
Contributor

orlitzky commented Jun 1, 2020

Commit: f5edb44

@orlitzky
Copy link
Contributor

orlitzky commented Jun 1, 2020

Author: Michael Orlitzky

@orlitzky
Copy link
Contributor

orlitzky commented Jun 1, 2020

comment:12

Here's a preliminary branch. I haven't run a ptestlong yet, and you can definitely crash sage if you ignore the warnings and disable the safety net. But I've basically eliminated the need for our custom GLPK error handling, and thus the need for the rejected patch to reset the error state.


New commits:

e39cffdTrac #29493: add sanity checks for GLPK's variable bound methods.
3fec617Trac #29493: add missing bounds checks for other GLPK backend methods.
3a1b741Trac #29493: add bounds checks to GLPK's add_linear_constraint() method.
3b75fc0Trac #29493: add bounds checks to GLPK's get_col_dual() method.
51824a6Trac #29493: verify input to GLPK row/column tableaux methods.
c9507a0Trac #29493: make simplex_then_intopt the default GLPK method.
82a7527Trac #29493: remove doctest that intentionally crashes GLPK.
7d8db10Trac #29493: remove glpk.error extension module.
f5edb44Trac #29493: drop rejected GLPK patch.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 2, 2020

Changed commit from f5edb44 to 1cc9f61

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 2, 2020

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

20e92efTrac #29493: make simplex_then_intopt the default GLPK method.
83ea75aTrac #29493: remove doctest that intentionally crashes GLPK.
c9b9a24Trac #29493: remove glpk.error and glpk.env extension modules.
1cc9f61Trac #29493: drop GLPK patch that was rejected by upstream.

@orlitzky
Copy link
Contributor

orlitzky commented Jun 2, 2020

comment:14

A ptestlong passes now, and I've cleaned up the commit messages. This is the only pseudo-regression introduced, if you ignore the warning about changing the default solver to intopt only:

sage: lp = MixedIntegerLinearProgram(solver = "GLPK")
sage: v = lp.new_variable(nonnegative=True)
sage: lp.add_constraint(v[1] +v[2] -2.0 *v[3], max=-1.0)
sage: lp.add_constraint(v[0] -4.0/3 *v[1] +1.0/3 *v[2], max=-1.0/3)
sage: lp.add_constraint(v[0] +0.5 *v[1] -0.5 *v[2] +0.25 *v[3], max=-0.25)
sage: lp.add_constraint(v[0] +4.0 *v[1] -v[2] +v[3], max=-1.0)
sage: lp.solver_parameter("simplex_or_intopt", "intopt_only")
sage: lp.solve()
Assertion failed: col->lb < col->ub
Error detected in file npp/npp5.c at line 528
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-8-34bddfdcab75> in <module>()
----> 1 lp.solve()

/home/mjo/src/sage.git/local/lib/python3.7/site-packages/sage/numerical/mip.pyx in sage.numerical.mip.MixedIntegerLinearProgram.solve (build/cythonized/sage/numerical/mip.c:15418)()
   2249         """
   2250         if log is not None: self._backend.set_verbosity(log)
-> 2251         self._backend.solve()
   2252         return self._backend.get_objective_value()
   2253 

/home/mjo/src/sage.git/local/lib/python3.7/site-packages/sage/numerical/backends/glpk_backend.pyx in sage.numerical.backends.glpk_backend.GLPKBackend.solve (build/cythonized/sage/numerical/backends/glpk_backend.c:9837)()
   1136         if ((self.simplex_or_intopt == glp_intopt_only)
   1137             or (self.simplex_or_intopt == glp_simplex_then_intopt) and (solution_status != GLP_UNDEF) and (solution_status != GLP_NOFEAS)):
-> 1138             sig_on()
   1139             solve_status = glp_intopt(self.lp, self.iocp)
   1140             solution_status = glp_mip_status(self.lp)

RuntimeError: Aborted

@kliem
Copy link
Contributor Author

kliem commented Jun 2, 2020

comment:15

I'm running tests here:

https://github.com/kliem/sage-test-27122/actions

e.g. the ubuntu eoan test will be here https://github.com/kliem/sage-test-27122/runs/731034181

(the naming of the repository is confusing, but I'm really only testing u/mjo/ticket/29493 on top of #29757).

@mkoeppe
Copy link
Member

mkoeppe commented Jun 9, 2020

Changed commit from 1cc9f61 to 066b1e8

@mkoeppe
Copy link
Member

mkoeppe commented Jun 9, 2020

comment:43

Replying to @orlitzky:

If I'm writing a new sage package and if I rely on being able to catch GLPKError

... this appears to be a documentation issue. I have added a commit that addresses it.


New commits:

83629f5sage.libs.glpk.error: Add documentation discouraging users from using GLPKError
066b1e8setup_glpk_error_handler: Disable doctest that tests the GLPK error recovery patch

@mkoeppe
Copy link
Member

mkoeppe commented Jun 9, 2020

comment:44

The two removal commits are now on #29829.

@mkoeppe
Copy link
Member

mkoeppe commented Jun 9, 2020

Changed author from Michael Orlitzky to Michael Orlitzky, Matthias Koeppe

@mkoeppe
Copy link
Member

mkoeppe commented Jun 9, 2020

Reviewer: Matthias Koeppe, ...

@mkoeppe
Copy link
Member

mkoeppe commented Jun 9, 2020

comment:46

Tests run at https://github.com/mkoeppe/sage/actions/runs/129317169

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 9, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

9427aa1src/sage/libs/glpk/error.pyx: Fixup doctest markup

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 9, 2020

Changed commit from 066b1e8 to 9427aa1

@mkoeppe
Copy link
Member

mkoeppe commented Jun 9, 2020

comment:48

New test at https://github.com/mkoeppe/sage/runs/755764195

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 9, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

28024acsrc/sage/libs/glpk/error.pyx: Make doctest more flexible

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 9, 2020

Changed commit from 9427aa1 to 28024ac

@mkoeppe
Copy link
Member

mkoeppe commented Jun 10, 2020

comment:50

New tests at https://github.com/mkoeppe/sage/actions/runs/130367945

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 10, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

8acdf34Fixup doctest

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 10, 2020

Changed commit from 28024ac to 8acdf34

@mkoeppe
Copy link
Member

mkoeppe commented Jun 10, 2020

comment:52

Passes tests now - see for example https://github.com/mkoeppe/sage/runs/756778388

@mkoeppe
Copy link
Member

mkoeppe commented Jun 10, 2020

comment:53

I have also tested with https://github.com/mkoeppe/sage-numerical-interactive-mip (which uses some of these methods).

I would be ready to set it to positive_review; but we need another reviewer for my (trivial) changes.

@orlitzky
Copy link
Contributor

comment:55
+    We install this error handler so that an error does not lead to
+    an immediate error exit of the process.  Instead, we raise a
+    ``GLPKError`` for the convenience of developers.

The sage process never exited, but the error left GLPK unusable from that point forward -- for example returning nothing instead of solving subsequent problems as in the ticket description. The error handler itself is fine; it's toggling the "there was an error" bit off afterwards that's causing all this trouble.

I still think it's perverse to leave around 150 lines of code with a docstring that says "don't use this," but where we are is not even locally optimal, so let's just do it.

@mkoeppe
Copy link
Member

mkoeppe commented Jun 10, 2020

comment:56

Replying to @orlitzky:

150 lines of code with a docstring that says "don't use this,"

... "for production".

@orlitzky
Copy link
Contributor

Changed reviewer from Matthias Koeppe, ... to Matthias Koeppe, Michael Orlitzky

@kliem
Copy link
Contributor Author

kliem commented Jun 10, 2020

comment:58

Thank you for doing this.

@vbraun
Copy link
Member

vbraun commented Jun 21, 2020

Changed branch from u/mkoeppe/ticket/29493 to 8acdf34

@vbraun vbraun closed this as completed in 55dc391 Jun 21, 2020
vbraun pushed a commit to vbraun/sage that referenced this issue Jun 16, 2024
    
The sage distribution patches GLPK to be able to recover from errors.
Upstream rejected this patch, said what it does is unsupported, and no
one else has ever adopted it. Sage works fine without it.

If someone really wants to add error recovery to GLPK, it has to be done
in a way that upstream does not object to. Otherwise, it cannot be done
reliably -- no linux distros or conda or homebrew are going to ship our
patch.

These changes date back to
sagemath#29493, and  this PR will close
sagemath#29829.
    
URL: sagemath#37801
Reported by: Michael Orlitzky
Reviewer(s): Matthias Köppe
vbraun pushed a commit to vbraun/sage that referenced this issue Jun 16, 2024
    
The sage distribution patches GLPK to be able to recover from errors.
Upstream rejected this patch, said what it does is unsupported, and no
one else has ever adopted it. Sage works fine without it.

If someone really wants to add error recovery to GLPK, it has to be done
in a way that upstream does not object to. Otherwise, it cannot be done
reliably -- no linux distros or conda or homebrew are going to ship our
patch.

These changes date back to
sagemath#29493, and  this PR will close
sagemath#29829.
    
URL: sagemath#37801
Reported by: Michael Orlitzky
Reviewer(s): Matthias Köppe
vbraun pushed a commit to vbraun/sage that referenced this issue Jun 17, 2024
    
The sage distribution patches GLPK to be able to recover from errors.
Upstream rejected this patch, said what it does is unsupported, and no
one else has ever adopted it. Sage works fine without it.

If someone really wants to add error recovery to GLPK, it has to be done
in a way that upstream does not object to. Otherwise, it cannot be done
reliably -- no linux distros or conda or homebrew are going to ship our
patch.

These changes date back to
sagemath#29493, and  this PR will close
sagemath#29829.
    
URL: sagemath#37801
Reported by: Michael Orlitzky
Reviewer(s): Matthias Köppe
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants