vignettes/SUtools.Rmd
There are many important R packages by Stanford authors using Fortran
for speed and efficiency. For example, Jerome Friedman, Trevor Hastie
and Robert Tibshirani have several packages, glmnet, glasso, gam to
name a few, that make use of Fortran. The Fortran in many cases is
generated using a preprocessor called Mortran (m77). It is a bit of a
chore to ensure that the generated Fortran code produces no warnings
and registers the native routines as required by CRAN. This package is
an attempt to make it almost automatic.
I’d put together code for this several times over the years but never quite organized it in one place.
A related package ftest provides a complete example of calling back a user-defined R function from Fortran.
While this package is directed specifically towards Mortran, there are several functions that many may find useful.
The registration function (gen_registration) can be used for ordinary
Fortran to generate registration code automatically. It only requires
the subroutine or function call statement as input; see
?gen_registration.
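To give a feel for what generating registration from a call statement involves, here is a minimal sketch. The helper registration_entry and the subroutine toy_fit are hypothetical and are not part of SUtools; the actual output of gen_registration may well look different. The sketch only counts the arguments in a single-line subroutine statement and formats one entry in the style of the tables used with R_registerRoutines.

```r
# Hypothetical helper (not part of SUtools): build one registration table
# entry from a single-line Fortran subroutine statement.
registration_entry <- function(stmt) {
  # Capture the routine name and the raw argument list.
  pattern <- "^\\s*subroutine\\s+(\\w+)\\s*\\(([^)]*)\\)"
  m <- regmatches(stmt, regexec(pattern, stmt, ignore.case = TRUE))[[1]]
  if (length(m) == 0) stop("not a subroutine statement: ", stmt)
  name <- tolower(m[2])
  # Split the argument list on commas and drop empty pieces.
  args <- trimws(strsplit(m[3], ",", fixed = TRUE)[[1]])
  args <- args[nzchar(args)]
  # One entry: routine name, function pointer, number of arguments.
  sprintf("{\"%s\", (DL_FUNC) &F77_NAME(%s), %d},", name, name, length(args))
}

# toy_fit is a made-up routine used only for illustration.
entry <- registration_entry("subroutine toy_fit(n, x, y)")
cat(entry, "\n")
```

A real generator also has to cope with statements that span several lines and with functions as well as subroutines, which this sketch ignores.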
The function fix_unused_labels can be used to automatically fix unused
label warnings that are generated by CRAN flags for Fortran.
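The idea behind such a fix can be sketched in a few lines. This is not the code in fix_unused_labels; the two helper names are hypothetical, and the warning wording ("Label N at (1) defined but not used") is my assumption about what gfortran prints. The sketch pulls the label numbers out of the warning text and blanks the label field (columns 1 to 5) on the matching fixed-form lines.

```r
# Hypothetical sketch, not the SUtools implementation.
# Assumption: gfortran reports unused labels with text of the form
#   "Warning: Label 10100 at (1) defined but not used"
unused_labels_from_warnings <- function(warnings) {
  pattern <- "Label [0-9]+ at \\(1\\) defined but not used"
  hits <- regmatches(warnings, regexpr(pattern, warnings))
  as.integer(sub("Label ([0-9]+) .*", "\\1", hits))
}

# Blank the label field (columns 1-5) on non-comment lines whose label
# appears in 'labels'.
blank_unused_labels <- function(fortran, labels) {
  is_comment <- grepl("^[cC*!]", fortran)
  label <- suppressWarnings(as.integer(trimws(substr(fortran, 1, 5))))
  hit <- !is_comment & !is.na(label) & label %in% labels
  if (any(hit)) substr(fortran[hit], 1, 5) <- "     "
  fortran
}

# Made-up compiler output and source, for illustration only.
warnings <- c("toy.f:3:5:",
              "Warning: Label 10100 at (1) defined but not used [-Wunused-label]")
fortran <- c("      subroutine toy(n)",
             "      integer n",
             "10100 continue",
             "10200 n = n + 1",
             "      if (n .lt. 3) goto 10200",
             "      end")
fixed <- blank_unused_labels(fortran, unused_labels_from_warnings(warnings))
writeLines(fixed)
```

Only the label 10100 is touched; 10200 is still the target of a goto and is left alone.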
We will use an example package pcLasso which has a Mortran file
(included in this package) that is used for the actual computations
along a path of values for \(\lambda\).
If all goes well, only one real function call is needed to generate both the Fortran and registration code.
library(SUtools)
mortran_file <- system.file("misc", "pcLasso.m", package = "SUtools")
result <- process_mortran(input_mortran_file = mortran_file,
                          pkg_name = "pcLasso",
                          control = sutools_control(fix_allocate = TRUE))
## Processing Mortran: reading file
## Processing Mortran: fixing allocate statements
## Processing Mortran: inserting implicit statements
## Processing Mortran; replacing reals by double precision
## Checking for long lines; can cause problems downstream if not fixed
## Note: Some lines could become longer, > 72 cols in %FORTRAN sections
## and > 80 cols in MORTRAN sections as a result of 'real' being
## replaced by 'double precision'. Split such lines into two;
## in %FORTRAN sections, use a continuation character in col 6.
## Seems ok, continuing
## Generating Fortran from Mortran
## Checking Fortran
## Chopping Lines at 72 cols
## Running gfortran to detect warning lines on unused labels
## Scanning gfortran output for warnings on unusued labels
## Generating Init function for package pcLasso
This will return a list of three items: a cleaned-up version of the
Mortran named mortran, the corresponding cleaned and processed Fortran
named fortran, and registration C code named pcLasso_init.c, where the
name is constructed from the package name provided. The Fortran and the
registration code can be saved in appropriate files in pcLasso/src
simply by using base::writeLines.
writeLines(result$fortran, "pcLasso/src/pcLasso.f")
writeLines(result[[3]], paste0("pcLasso/src/", names(result)[[3]]))
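If the target directory may not exist yet, or you would rather not rely on the position of the registration code in the list, something like the following sketch may help. The result list here is a stand-in with placeholder contents so that the snippet runs on its own, and it writes under tempdir() rather than into a real package; both are assumptions made for illustration.

```r
# Stand-in for the list returned by process_mortran(); contents are placeholders.
result <- list(mortran = "placeholder Mortran source",
               fortran = c("      subroutine toy", "      end"),
               pcLasso_init.c = "/* placeholder registration code */")

# Create the src directory if needed (a temporary location for this sketch).
src_dir <- file.path(tempdir(), "pcLasso", "src")
dir.create(src_dir, recursive = TRUE, showWarnings = FALSE)

# Save the Fortran, then find the registration code by name, not position.
writeLines(result$fortran, file.path(src_dir, "pcLasso.f"))
init_name <- grep("_init\\.c$", names(result), value = TRUE)
writeLines(result[[init_name]], file.path(src_dir, init_name))

list.files(src_dir)
```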
The process_mortran function goes through several steps.
1. Fix allocate statements in the Mortran file. (This step is skipped
   if the option fix_allocate is FALSE, the default!) Lines of the type

       allocate(a(1:ni),stat=ierr);

   are replaced with

       allocate(a(1:ni),stat=jerr); if(jerr.ne.0) return;

   ensuring that warnings about the variable jerr being ignored go
   away.

2. Insert an implicit double precision(a-h,o-z) statement to ensure
   double precision calculations.

3. All real variables are replaced with double precision variables.
   Constants with exponents such as [eE][+-]?[0-9]+ are replaced by
   double precision equivalents.

4. As a result of replacing real with double precision, some lines may
   go over the 72 character limit in the Fortran sections. For Mortran,
   this limit is 80 characters and that is also checked. If this check
   fails, the function exits with a detailed list of problems and
   approximate line numbers for the user to address.

5. The Mortran executable is run on the Mortran file to produce the
   Fortran file with extension .for.

6. There is extraneous stuff that Mortran adds which can again trigger
   gfortran warnings. So this step chops off things beyond 72 columns
   to yield a .f file.

7. Next gfortran is run on the code in the .f file with flags -Wunused
   to detect unused labels. The output is then scanned for the warning
   messages. If the warning messages pertain only to unused labels, an
   automatic fix is made on the generated Fortran file. Otherwise, an
   informative message is printed with hints on how to fix the source.

8. If registration is asked for and a package name is provided, the
   registration code is generated in a file typically named based on
   the package, pcLasso_init.c in our example. This is done by scanning
   the Mortran file for subroutine declarations (even if they span
   several lines) and using implicit Fortran conventions to generate C
   registration code.
If one provides the package name, registration for all the subroutines
in the Mortran file will be generated, even those that are not called
from R. This may not be desirable. One can selectively generate
registration for certain subroutines as is illustrated in the example
for the function gen_registration.
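Some of the text transformations described in the steps above (the allocate fix, the replacement of real, the long-line check and the chop at 72 columns) can be sketched with base R string functions. The helper names below are hypothetical and the regular expressions are deliberately naive; this is an illustration under those assumptions, not the code that process_mortran runs.

```r
# Hypothetical helpers illustrating some of the steps; not the SUtools code.

# Rewrite the stat= variable to jerr and append an early return, so that
# "allocate(a(1:ni),stat=ierr);" becomes the form shown in the text.
fix_allocate <- function(lines) {
  sub("(allocate\\(.*,stat=)\\w+(\\);)",
      "\\1jerr\\2 if(jerr.ne.0) return;", lines)
}

# Naive replacement of the word 'real' by 'double precision'.
real_to_double <- function(lines) {
  gsub("\\breal\\b", "double precision", lines, perl = TRUE)
}

# Indices of lines longer than 'limit' characters.
long_lines <- function(lines, limit) which(nchar(lines) > limit)

# Keep only columns 1-72 of each line.
chop_72 <- function(lines) substr(lines, 1, 72)

# Made-up Mortran input for illustration.
mortran <- c("allocate(a(1:ni),stat=ierr);", "real x(n),y(n);")
step1 <- fix_allocate(mortran)
step3 <- real_to_double(step1)
print(step3)

# A made-up 80-column line; "EXTRA123" stands in for whatever sits past col 72.
wide <- sprintf("%-72s%s", "      x = 1", "EXTRA123")
print(long_lines(c(wide, "      end"), 72))
chopped <- chop_72(wide)
print(nchar(chopped))
```

The long-line check matters precisely because the second transformation lengthens lines: each real grows by twelve characters, which is enough to push a previously legal line past the limit.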
Included in this package is a modified version of the Mortran preprocessor, which is used as a utility to automate these tasks as much as possible.
I am sure this package could be improved. Feel free to make a PR.