Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The matrix size of LAPACKE_sge_trans in lapacke_sgeqrt_work.c seems to be not correct #766

Closed
2 tasks done
sbite0138 opened this issue Nov 26, 2022 · 8 comments · Fixed by #768
Closed
2 tasks done

Comments

@sbite0138
Copy link

sbite0138 commented Nov 26, 2022

Description

When LAPACKE_sgeqrt is executed with row major, the following code is executed to convert the result obtained with column major to row major.

LAPACKE_sge_trans( LAPACK_COL_MAJOR, ldt, MIN(m,n), t_t, ldt_t, t,
ldt );

Here, ldt * MIN(m,n) should be the number of elements in t_t (= the number of elements in t), but this expression takes more values when ldt > nb nb > n.

This causes a segmentation fault when LAPACKE_sgeqrt is executed with row major and ldt > n.
Here is an example that causes the problem:

#include <stdio.h>
#include <time.h>
#include <lapacke.h>

int main(int argc, char *argv[])
{
    srandom(42);
    const int N = 16;
    const int NB = 4;
    const int LDA = N;
    const int LDT = 100 * N; // set the value so that LDT > N holds.

    // allocate A and T
    float *A = (float *)malloc(N * LDA * sizeof(float));
    float *T = (float *)malloc(N * LDT * sizeof(float));

    // initialize A
    for (int i = 0; i < N; i++)
    {
        for (int j = 0; j < N; j++)
        {
            A[i * LDA + j] = (float)random() / (float)RAND_MAX;
        }
    }

    // call sgeqrt
    LAPACKE_sgeqrt(LAPACK_ROW_MAJOR, N, N, NB, A, LDA, T, LDT);

    // print results
    for (int i = 0; i < N; i++)
    {
        for (int j = 0; j < N; j++)
        {
            printf("% 8f ", A[i * LDA + j]);
        }
        printf("\n");
    }
    printf("\n");
    for (int i = 0; i < NB; i++)
    {
        for (int j = 0; j < N; j++)
        {
            printf("% 8f ", T[i * LDT + j]);
        }
        printf("\n");
    }

    // free memory
    free(A);
    free(T);
    return 0;
}

Shouldn't this line look like this?

LAPACKE_sge_trans(LAPACK_COL_MAJOR, nb, MIN(m,n), t_t, ldt_t, t, ldt );

I'm thinking that other LAPACKE_*geqrt functions may contain similar problems.

Checklist

  • I've included a minimal example to reproduce the issue
  • I'd be willing to make a PR to solve this issue
@langou
Copy link
Contributor

langou commented Nov 27, 2022

Thanks. I agree this is an issue, and your proposed change would fix it. Can you please submit a pull request for LAPACKE_sgeqrt? And also LAPACKE_cgeqrt, LAPACKE_dgeqrt, and LAPACKE_zgeqrt?

@langou
Copy link
Contributor

langou commented Nov 27, 2022

I want to follow up on this issue, the proposed fix corrects the segmentation fault and this is a good thing.

However there is something that I do not like in how this LAPACKE routine works. The process, right now, is (1) the user works in row major, () they give an array T for the T matrix, () column major LAPACK works in the lower parts of T, (*) in order to return in row major, we transpose the array T.

However, we could ask ourselves whether it is OK for LAPACK to write in the strictly lower parts of T. As of now, I am reading the headers of DGEQRT and there is no promise to leave the strictly lower parts of T unchanged. So what LAPACKE does not break the headers. But we can wonder whether we want to make the header of DGEQRT more strict and guarantee that LAPACK will not write in the strictly lower parts of T.

All in all, I think maybe, in LAPACKE, we should (step 1) transpose T before calling Column Major LAPACK, (step 2) call Column Major LAPACK, and (step 3) transpose T after.

This is the same as what is done to the array A.

My preference would be do add a

LAPACKE_sge_trans(LAPACK_COL_MAJOR, nb, MIN(m,n), t_t, ldt_t, t, ldt );

before calling

LAPACK_dgeqrt( &m, &n, &nb, a_t, &lda_t, t_t, &ldt_t, work, &info );

(And one after too as proposed by @sbite0138.)

@sbite0138
Copy link
Author

Thank you for your response.

Can you please submit a pull request for LAPACKE_sgeqrt? And also LAPACKE_cgeqrt, LAPACKE_dgeqrt, and LAPACKE_zgeqrt?

My preference would be do add a

LAPACKE_sge_trans(LAPACK_COL_MAJOR, nb, MIN(m,n), t_t, ldt_t, t, ldt );
before calling

LAPACK_dgeqrt( &m, &n, &nb, a_t, &lda_t, t_t, &ldt_t, work, &info );
(And one after too as proposed by @sbite0138.)

I see. I agree with this method of implementation. I will create a PR using the method you suggested.

@sbite0138
Copy link
Author

Sorry for the delay. I created a pull request.
While reading the code, I felt that the value of ldt_t needed to be changed as well. Currently ldt_t is defined as follows:

lapack_int ldt_t = MAX(1,ldt);

Here, ldt is the leading dimension of t, and since t is a row major, ldt is the "horizontal" length.
On the other hand, since ldt_t is the leading dimension of t_t and t_t is col major, this value should be a "vertical" length.

I think the value of ldt_t should be as follows:

        lapack_int ldt_t = MAX(1,nb);

The pull request also reflects the above change, but if I have misunderstood, I would appreciate it if you could point this out to me.

@langou
Copy link
Contributor

langou commented Nov 28, 2022

Hi @sbite0138

I think we want to leave the line

lapack_int ldt_t = MAX(1,ldt);

@scr2016: can you pitch in?

Julien.

@sbite0138
Copy link
Author

sbite0138 commented Dec 7, 2022

Hi @langou

I am very sorry for the delay in replying.

I think we want to leave the line

OK, I have pushed back the changes regarding ldt_t.
If you don't mind, may I ask the reason why the change should not be made?

@langou
Copy link
Contributor

langou commented Dec 23, 2022

If you don't mind, may I ask the reason why the change should not be made?

If T has a leading dimension, (meaning ldt is strictly larger than nb) and you write

lapack_int ldt_t = MAX(1,nb);

then the code will write in some part of the ldt*n array T that you do not want it to write. It will work on the nb*n first entries of T. (It's more complicated than that because we have a sequence of upper nb-by-nb triangle, but that's the idea.) What you want it to do is to work on [0 to nb-1], then [ldt to ldt+nb-1], then [2*ldt to 2*ldt+nb-1], etc. That's the same nb*n entries. But not in the same positions in ldt*n array T.

This is only when ldt>nb.

Julien.

@sbite0138
Copy link
Author

Thank you for your response. I apologize for the late reply.

then the code will write in some part of the ldt*n array T that you do not want it to write. it will work on the nb*n first entries of T.

Just to confirm, does this mean that any write to T_t will also affect the original T array?
From my understanding, since ldt_t is used as the leading dimension for T_t and T_t is newly malloc'd (not a pointer to T), a write to T_t should not impact the original T array. Therefore, it seems unnecessary to set ldt_t to MAX (1,ldt).

Please let me know if there are any other considerations that I am missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants