memory and mbind problems in convGDX2MIF / reportFE #549

orichters · 2024-02-27T13:36:49Z

Running output.R -> single -> reporting.R runs out of memory with standard --mem=8000 setting, even --mem=12000 fails.
Reason seems to be that reportFE requires more memory
Using output.R -> single -> reporting, @orichters started output generation for several AMT testOneRegi runs. Result: Until 2023-12-22, it works, afterwards, it fails, see cd /p/projects/remind/modeltests/remind/output; grep oom-kill testOne*/log_output.txt
@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q found out it is gdx::readGDX(), triggered by vm_demFENonEnergySector:

> foo <- gdx::readGDX('./output/testOneRegi-Base-AMT_2024-02-16_22.08.20/fulldata.gdx', 'vm_demFENonEnergySector', field = 'l', restore_zeros = TRUE, react = 'silent')
> format(object.size(foo), units = 'auto')
[1] "3 Gb"
> bar <- gdx::readGDX('./output/testOneRegi-Base-AMT_2024-02-16_22.08.20/fulldata.gdx', 'vm_demFENonEnergySector', field = 'l', restore_zeros = FALSE, react = 'silent')
> format(object.size(bar), units = 'auto')
[1] "138.9 Kb"

According to @0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q, one cannot simply set restore_zeros = FALSE because we declare all variables over the general supersets (e.g., tall, all_enty), but usually define them only over subsets (t, entyFE). restore_zeros = FALSE reads in only data that is not zero, which is usually what we want. But if some axis of the data (e.g. all values in 2010, or everything fehos) happens to be entirely zero (usually that happens for specific gdxes only), that dimension of the resulting magpie object differs (e.g. c(2005, 2015, …) instead of c(2005, 2010, 2015, …)) and it is incompatible with all the rest of the objects we have. What would be needed is a function to expands a variable read in with restore_zeros = FALSE to the actual sets it should have. But that is a moving target, as it depends on all the other variables/parameters used in conjunction with it.
magpie_expand is not the solution

Some code to start with:

library(magclass)
t <- c(1995, 2005, 2015)
s <- c('AFR', 'CPA')
A <- dimSums(maxample('pop')[s,t,'A2'], 3)
B <- dimSums(maxample('pop')[s,t[-2],'B1'], 3)

The question is how to be able to sum, mbind etc. A and B.

The problem is that the dimensions each variable is defined (not declared) over in the GAMS code (e.g. t instead of tall, se2fe(entySE,entyFE,te) instead of all all_enty/all_enty/all_te combinations) is not pre-defined, and the problem depends on which other variables the variables is combined with (in magpie operations) and what data those variables have.

The text was updated successfully, but these errors were encountered:

orichters · 2024-02-27T13:37:02Z

code suggestion by @0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q to expand a to the joint size of a and b:

somebody_name_this_function <- function(a, b, fill = 0) {
    r <- new.magpie(cells_and_regions = union(getRegions(a), getRegions(b)),
                    years = sort(union(getYears(a, as.integer = TRUE), 
                                       getYears(b, as.integer = TRUE))),
                    names = union(getNames(a), getNames(b)),
                    fill = fill,
                    sets = names(dimnames(b)))
   
    r[getRegions(a),getYears(a),getNames(a)] <- a
    return(r)
}
B_ <- B[,t[-2],] %>% somebody_name_this_function(A)
A + B_

which returns:

An object of class "magpie"
, , 1
     t
i        y1995   y2005   y2015
  AFR 1105.333  696.44 1821.22
  CPA 2561.270 1429.53 3018.20

> system.time({
+     x <- vm_demFENonEnergySector <- readGDX(gdx, "vm_demFENonEnergySector",
+                                             field = "l", restore_zeros = TRUE,
+                                             react = "silent")[,t,]})
   user  system elapsed 
 12.134   4.380  16.553 
> format(object.size(x), units = 'auto')
[1] "1.5 Gb"

> system.time({
+     ref <- new.magpie(
+         cells_and_regions = readGDX(gdx, 'regi'),
+         years = readGDX(gdx, 't'),
+         names = quitte::cartesian(
+             readGDX(gdx, 'se2fe') %>% 
+                 mutate(x = paste(.data$all_enty, .data$all_enty1,
+                                  sep = '.')) %>% 
+                 pull('x'),
+             'indst',
+             readGDX(gdx, 'all_emiMkt')),
+         fill = 0,
+         sets = c('t', 'regi', 'entySE', 'entyFE', 'sector', 'emiMkt'))
+     
+     
+     
+     y <- vm_demFENonEnergySector <- readGDX(gdx, "vm_demFENonEnergySector",
+                                             field = "l", restore_zeros = FALSE,
+                                             react = "silent") %>% 
+         somebody_name_this_function(ref)})
   user  system elapsed 
  0.133   0.000   0.129 
> format(object.size(y), units = 'auto')
[1] "257.8 Kb"

orichters · 2024-02-27T13:38:05Z

I rather think of a one-sided function that restricts and expands x to the size of ref:

somebody_name_this_function <- function(x, ref, fill = 0) {
  r <- new.magpie(cells_and_regions = getRegions(ref),
                  years = getYears(ref),
                  names = getNames(x),
                  fill = fill,
                  sets = names(dimnames(ref)))
  r[intersect(getRegions(x), getRegions(r)), intersect(getYears(x), getYears(r)), getNames(x)] <- x
  return(r)
}

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-02-27T14:35:55Z

"Nicht mein Zirkus, nicht meine Affen."

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-02-27T16:12:03Z

region <- c("AAA", "BBB", "CCC")
t      <- paste0("y", c(2000, 2005, 2010))
name   <- c("foo", "bar", "bazz")

A <- new.magpie(cells_and_regions = region, years = t, names = name, fill = 1,
                sets = c("region", "t", "name"))

# spatial ----
## x smaller then ref ----
expect_equal(somebody_name_this_function(A[region[-2],,], A),
             `mselect<-`(A, region = region[2], value = 0))

## x larger then ref ----
expect_equal(somebody_name_this_function(A, A[region[-2],,]),
             A[region[-2],,])

# temporal ----
## x smaller then ref ----
expect_equal(somebody_name_this_function(A[,t[-2],], A),
             `mselect<-`(A, t = t[2], value = 0))

## x larger then ref ----
expect_equal(somebody_name_this_function(x = A, ref = A[,t[-2],]),
             A[,t[-2],])

# names ----
## x smaller then ref ----
expect_equal(somebody_name_this_function(A[,,name[-2]], A),
             `mselect<-`(A, name = name[2], value = 0))

## x larger then ref ----
expect_equal(somebody_name_this_function(A, A[,,name[-2]]),
             A[,,name[-2]])

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-02-28T08:17:51Z

I found a (somewhat) reliable way to assess the size of the problem.

/p/projects/remind/modeltests/remind/output/SSP2EU-PkBudg1050-AMT_2024-02-24_01.16.37 (develop|?? M)$ /usr/bin/time -v Rscript -e 'format(object.size(gdx::readGDX("fulldata.gdx", "vm_demFENonEnergySector", field = "l", restore_zeros = FALSE)), units = "auto")' 2>&1 | grep -E '(\[|Maximum resident set size)'
[1] "66 Kb"
	Maximum resident set size (kbytes): 131152
/p/projects/remind/modeltests/remind/output/SSP2EU-PkBudg1050-AMT_2024-02-24_01.16.37 (develop|?? M)$ /usr/bin/time -v Rscript -e 'format(object.size(gdx::readGDX("fulldata.gdx", "vm_demFENonEnergySector", field = "l", restore_zeros = TRUE)), units = "auto")' 2>&1 | grep 
-E '(\[|Maximum resident set size)'
[1] "3 Gb"
	Maximum resident set size (kbytes): 9496012

So the returned object goes from 66 kb to 3 GB (with filtering for t it is 1.5 GB I believe), while peak memory demand increases by 9.3 GB.

Peak memory demand of convGDX2MIF() seems to be 16 GB currently:

$ sbatch --wrap "/usr/bin/time -v Rscript -e 'x <- remind2::convGDX2MIF(\"/p/projects/remind/modeltests/remind/output/SSP2EU-
PkBudg1050-AMT_2024-02-24_01.16.37/fulldata.gdx\")' 2>&1 | grep 'Maximum resident set size'" --output=foo.log --qos="priority" --mem=480
00 --mail-type=END
$ cat foo.log 
	Maximum resident set size (kbytes): 15919132

orichters · 2024-02-28T08:20:05Z

Peak memory demand of convGDX2MIF() seems to be 16 GB currently:

That fits my experience that using --mem=16000 was sufficient.

fbenke-pik · 2024-03-01T09:35:50Z

Will work on this next week (KW10)

fbenke-pik · 2024-03-01T15:06:06Z

I rather think of a one-sided function that restricts and expands x to the size of ref:

Keeping the use case of this helper in remind2 reporting in mind, I'd assume it should

expand x to ref, but not restrict it to ref
~~only be applied to dim 1 and 2 (region and year)~~ Nvm, thinking about it more, I guess it won't hurt to expand to dim 3

fbenke-pik · 2024-03-04T16:49:53Z

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q could you point me to gdxes with irregularities that could make reportEmi and reportFE crash if I don't restore zeros?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-03-05T21:40:50Z

Nope.
@mellamoSimon worked on this, I only surmised why he did what he did. I have no idea if that is a problem that crops up regularly, or if he encountered it at all, or if he just did as the Romans do.
But in theory you should be able to test this with every instance of readGDX(restore_zeros = TRUE):

$ ag 'restore_zeros *= *T'
R/reportFE.R
82:  vm_demFENonEnergySector <- readGDX(gdx, "vm_demFENonEnergySector", field = "l", restore_zeros = T, react = "silent")[,t,]*TWa_2_EJ

R/reportSE.R
77:    storLoss <- readGDX(gdx, name = c("v32_storloss", "v_storloss"), field = "l", restore_zeros = TRUE, format = "first_found") * pm_conv_TWa_EJ
434:  pm_fuExtrOwnCons <- readGDX(gdx, "pm_fuExtrOwnCons", restore_zeros = T)[,,getNames(pm_fuExtrOwnCons_reduced)]

R/reportPrices.R
991:  o01_CESderivatives <- readGDX(gdx, "o01_CESderivatives", restore_zeros = T, react = "silent")

R/reportEmi.R
149:  # note: this has to be read in with restore_zeros=T because sometimes it contains only non-zero values for "ETS" emiMkt
151:  pm_IndstCO2Captured <- readGDX(gdx, "pm_IndstCO2Captured", restore_zeros = T,
196:  vm_emiIndCCS <- readGDX(gdx, "vm_emiIndCCS", field = "l", restore_zeros = T)[, t, ]
332:  vm_plasticsCarbon <- readGDX(gdx, "vm_plasticsCarbon", field = "l", restore_zeros = T, react = "silent")[,t,]
338:    vm_feedstockEmiUnknownFate  <- readGDX(gdx, "vm_feedstockEmiUnknownFate", field = "l", restore_zeros = T, react = "silent")[,t,]
339:    vm_incinerationEmi          <- readGDX(gdx, "vm_incinerationEmi", field = "l", restore_zeros = T, react = "silent")[,t,]
340:    vm_nonIncineratedPlastics   <- readGDX(gdx, "vm_nonIncineratedPlastics", field = "l", restore_zeros = T, react = "silent")[,t,]
2704:  p47_LULUCFEmi_GrassiShift <- readGDX(gdx, "p47_LULUCFEmi_GrassiShift", restore_zeros = T, react = "silent")[getRegions(out), getYears(out),]

mellamoSimon · 2024-03-06T09:37:44Z

hey, yeah, that's a good assumption, I eventually set restore_zeros = T because the script would fail otherwise. I did that long ago though so I don't recall the exact mismatch of dimensions that would happen. I can try to reproduce it and let you know? @fbenke-pik

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-03-06T10:41:15Z

Keeping the use case of this helper in remind2 reporting in mind, I'd assume it should

expand x to ref, but not restrict it to ref

But the use case in remind2 does always include also a restriction. All the examples above have something like [,t,] attached to the readGDX() call (except for storLoss, which does it in a funky way a couple of lines below).
I agree with @orichters on this one.

fbenke-pik · 2024-03-06T16:23:40Z

Thanks Simón. Right now I favour the following approach:

Avoid restoring zeros whenever I see no immediate need to do so
Run adjusted reporting on all the latest AMT gdxes
Deal with the crashes coming out of this individually
Merge and keep an eye on the reporting

Let me know in case there are objections. If not, then nothing to do on your end, Simón.

mellamoSimon · 2024-03-07T09:19:26Z

Hi Falk, that sounds good, thank you!

fbenke-pik · 2024-03-12T10:22:56Z

Looks like this PR should solve the problems:
#557

Could solve the problems mostly by introducing new options to readGDX (pik-piam/gdx#8).

As of now, I don't see a need for somebody_name_this_function.

Validation:

I reran validation on the latest REMIND runs that actually produced a gdx /p/tmp/benke/validation
They all succeeded with mem=8000
Summation problems seem to be the same as /p/tmp/benke/validation/validate_remind2-29209181.out
Some gaps have gotten smaller actually (does not have to be due to this PR though, but other changes since original reporting ran), none have increased from what I can see

fbenke-pik · 2024-03-12T14:18:42Z

Leaving this open, as it will probably become relevant in the future again. (see #524)

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-03-13T09:22:23Z

The remindmodel/remind#1609 problem comes down to the input.gdx, which is from the current set of CES parameters sports a whole lot of zeros for SE/FE technologies that do not actually exist.

readGDX(gdx, 'vm_demFENonEnergySector', field = 'l', restore_zeros = FALSE)[,t,] %>%
    magclass_to_tibble() %>% 
    anti_join(readGDX(gdx, 'sefe')) %>% 
    filter(!is.na(value)) %>% 
    distinct(all_enty, all_enty1, .keep_all = TRUE)
Joining with `by = join_by(all_enty, all_enty1)`
# A tibble: 19 × 7
   all_regi  ttot all_enty all_enty1 emi_sectors all_emiMkt value
   <chr>    <int> <chr>    <chr>     <chr>       <chr>      <dbl>
 1 EUR       2005 seliqbio fegas     indst       ETS            0
 2 EUR       2005 seliqbio fesos     indst       ETS            0
 3 EUR       2005 seliqsyn fegas     indst       ETS            0
 4 EUR       2005 seliqsyn fesos     indst       ETS            0
 5 EUR       2005 sesobio  fegas     indst       ETS            0
 6 EUR       2005 sesobio  fehos     indst       ETS            0
 7 EUR       2005 seel     fegas     indst       ETS            0
 8 EUR       2005 seel     fehos     indst       ETS            0
 9 EUR       2005 seel     fesos     indst       ETS            0
10 EUR       2005 seh2     fegas     indst       ETS            0
11 EUR       2005 seh2     fehos     indst       ETS            0
12 EUR       2005 seh2     fesos     indst       ETS            0
13 EUR       2005 segabio  fehos     indst       ETS            0
14 EUR       2005 segabio  fesos     indst       ETS            0
15 EUR       2005 segasyn  fehos     indst       ETS            0
16 EUR       2005 segasyn  fesos     indst       ETS            0
17 EUR       2005 sehe     fegas     indst       ETS            0
18 EUR       2005 sehe     fehos     indst       ETS            0
19 EUR       2005 sehe     fesos     indst       ETS            0

Probably some REMIND/GAMS carry-over.

ricardarosemann · 2024-03-13T14:34:07Z

I also came across the issue of the non-existing SE/FE technologies today, when trying to do reporting for a Remind Base run, which failed due to the vm_demFeSector/vm_demFENonEnergySector mismatch. Falk's merge fortunately fixed that.
I tried to trace down the origin of this error and I would guess that the non-existing SE/FE technologies might come from (in Remind)

q37_FossilFeedstock_Base(t,regi,entyFe,emiMkt)$(
                         entyFE2sector2emiMkt_NonEn(entyFe,"indst",emiMkt)
                     AND cm_emiscen eq 1                                   ) ..
  sum(entySe, vm_demFENonEnergySector(t,regi,entySe,entyFe,"indst",emiMkt))
  =e=
  sum(entySeFos,
    vm_demFENonEnergySector(t,regi,entySeFos,entyFe,"indst",emiMkt)
  )
;

as there is no sefe mapping active here (and this equation is only active for Base runs). Maybe it's worth looking into this.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-03-14T07:37:06Z

Thanks Ricarda.
@orichters, was remindmodel/remind#1609 a Base run?

orichters · 2024-03-14T07:44:46Z

The error that occured? That was in a testOneRegi.

ricardarosemann · 2024-03-14T07:48:21Z

testOneRegi runs should also be subject to the equation I posted - as it also has cm_emiscen = 1 (the default).

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q · 2024-03-14T08:25:59Z

No, you can run testOneRegi for any configuration you want. Baseline, NPi, Policy, calibration.
q37_FossilFeedstock_Base is standing next to the Guillotine anyhow, so …

orichters · 2024-03-14T08:31:34Z

Well, it was a testOneRegi as called by make test, so with default settings.

fbenke-pik · 2024-04-30T14:40:01Z

Closing, as this seems to be stable now, not causing any further issues.

orichters assigned 0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q and fbenke-pik Feb 27, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q removed their assignment Feb 27, 2024

This was referenced Feb 28, 2024

increase memory on single node runs remindmodel/remind#1567

Merged

repair 'make test-full': remove fails for coupled runs, cleanup start_bundle_coupled, remove outdated configs remindmodel/remind#1580

Merged

mellamoSimon mentioned this issue Mar 12, 2024

avoid restoring zeros in reporting #557

Merged

fbenke-pik mentioned this issue Mar 13, 2024

add matchDim helper #563

Merged

fbenke-pik closed this as completed Apr 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

memory and mbind problems in convGDX2MIF / reportFE #549

memory and mbind problems in convGDX2MIF / reportFE #549

orichters commented Feb 27, 2024 •

edited

Loading

orichters commented Feb 27, 2024 •

edited

Loading

orichters commented Feb 27, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Feb 27, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Feb 27, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Feb 28, 2024

orichters commented Feb 28, 2024

fbenke-pik commented Mar 1, 2024

fbenke-pik commented Mar 1, 2024 •

edited

Loading

fbenke-pik commented Mar 4, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 5, 2024

mellamoSimon commented Mar 6, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 6, 2024

fbenke-pik commented Mar 6, 2024 •

edited

Loading

mellamoSimon commented Mar 7, 2024

fbenke-pik commented Mar 12, 2024

fbenke-pik commented Mar 12, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 13, 2024

ricardarosemann commented Mar 13, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 14, 2024

orichters commented Mar 14, 2024

ricardarosemann commented Mar 14, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 14, 2024

orichters commented Mar 14, 2024

fbenke-pik commented Apr 30, 2024

memory and mbind problems in convGDX2MIF / reportFE #549

memory and mbind problems in convGDX2MIF / reportFE #549

Comments

orichters commented Feb 27, 2024 • edited Loading

orichters commented Feb 27, 2024 • edited Loading

orichters commented Feb 27, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Feb 27, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Feb 27, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Feb 28, 2024

orichters commented Feb 28, 2024

fbenke-pik commented Mar 1, 2024

fbenke-pik commented Mar 1, 2024 • edited Loading

fbenke-pik commented Mar 4, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 5, 2024

mellamoSimon commented Mar 6, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 6, 2024

fbenke-pik commented Mar 6, 2024 • edited Loading

mellamoSimon commented Mar 7, 2024

fbenke-pik commented Mar 12, 2024

fbenke-pik commented Mar 12, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 13, 2024

ricardarosemann commented Mar 13, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 14, 2024

orichters commented Mar 14, 2024

ricardarosemann commented Mar 14, 2024

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented Mar 14, 2024

orichters commented Mar 14, 2024

fbenke-pik commented Apr 30, 2024

orichters commented Feb 27, 2024 •

edited

Loading

orichters commented Feb 27, 2024 •

edited

Loading

fbenke-pik commented Mar 1, 2024 •

edited

Loading

fbenke-pik commented Mar 6, 2024 •

edited

Loading