The theory behind mSR MINUIT function subroutines
Ivan Reid, PSI mSR Facility
Aug 21, 2000

These notes are as much a help to myself as to anyone else. I did write something similar at TRIUMF many years ago, but that was pre-L^AT_EX, so the maths was hand-written and never survived.

Calculating derivatives

The most powerful feature of MINUIT is the MIGRAD minimisation routine. However, this procedure requires the gradients of the minimisation function with respect to all the parameters; i.e. the partial derivatives. It is obviously much more efficient to supply these within the user-written function (called generically FCN) than to have MINUIT calculate them by successively varying each parameter.

In general, the function being minimised is a c² sum:

S =
å
i
(d_i - Y(t_i))²
s_i²

where the sum is over all the channels of data, d_i is the data, Y(ti) is the function being fitted, and s_i is the standard deviation in the data. Since mSR data are statistical in nature, the standard deviation is given by s_i = Ö[(d_i)]. A simplification a rises if we consider the standard deviation to be that of the function Y rather than that of the data. The sum then becomes

S =
å
i
(d_i - Y(t_i))²
Y(t_i)

and the gradients are the partial derivatives ¶S/¶p_n with respect to the parameters p_n:

¶S
¶p_n

=

¶
¶p_n

å
i
(d_i - Y(t_i))²
Y(t_i)

=

å
i
2
(d_i - Y(t_i)) ¶Y(t_i)
¶p_n
Y(t_i)
-
(d_i - Y(t_i))² ¶Y(t_i)
¶p_n
Y(t_i)²

=

å
i
- é
ê
ë 2(d_i -Y(t_i))
Y(t_i)
+ (d_i - Y(t_i))²
Y(t_i)²
ù
ú
û ¶Y(t_i)
¶p_n

=

å
i
- é
ê
ë 2 d_i Y(t_i) - 2 Y(t_i)² +d_i² -< /font >2 d_i Y(t_i)+Y(t_i)²
Y(ti)²
ù
ú
û ¶Y(t_i)
¶p_n

=

å
i
Y(t_i)² - d_i²
Y(t_i)²
¶Y(t_i)
¶p_n

=

å
i
æ
ç
è 1- d_i²
Y(t_i)²
ö
÷
ø ¶Y(t_i)
¶p_n

Note that if the data have been averaged over k bins, as I am currently doing with MINFIT so that the normalisation and background remain constant despite packing, the above derivation needs to be corrected. The true c² sum in this case is

S_k

=

å
i
(kd_i - kY(t_i))²
kY(t_i)

=

kS

so that in order to maintain the correct values for c² and the gradients, the calculated values must be multiplied by the packing (or binning) factor k.

For mSR data, the theory functions take the form

Y(t) = b + Ne^-t/t ( 1 + f(t) )

where b is the background and N the normalisation at t₀. We thus hav e

¶Y(t)
¶b

=

1
< tr>
¶Y(t)
¶N

=

e^-t/t ( 1 + f(t) )
¶Y(t)
¶p_n

=

Ne^-t/t ¶f(t)
¶p_n

For the case now where f(t) = å_m a_m g_m(t), which again corresponds to mSR experiments,

¶f(t)
¶a_m
= g_m(t)

and

¶f(t)
¶p_n
=
å
m
a_m ¶g_m(t)
¶p_n

where obviously the sum is only over those signals g_m(t) which actual ly depend on the given parameter p_n.

I have written the ``new'' MINUIT fitting routines based on the above treatment. Firstly, there are several discrete ``functions'' based on a full implementation of c² and derivative calculations. These are (or at least can be) the most efficient implementations, but must be tailor-made for each separate new combination of signals. For these, I have defined two types of the parameters p_n. The first are ``global'' parameters which are true for all histograms, such as magnetic field and relaxation parameters, while the second type are those which pertain to individual histograms, such as normalisation, background, amplitudes, phases, etc. Each separate function has a name associated with it (I have made heavy use of VAX FORTRAN's STRUCTURE facility to associate properties belonging to each function) whic h is currently (21/3/91) used to open a configuration or ``initial-guess'' file name.CON (e.g. GAUSSIAN.CON for th e case of a muon signal with Gaussian relaxation). I will provide fuller details of this at a later time elsewhere.

One of these functions allows the inclusion of several ``signals'' into the theoretical form - these are the direct analogues of the g_ms abov e. Each signal (defined in another STRUCTURE) now has four types of parameters, those which are global to all histograms and may be shared with another signal (e.g. magnetic field in a two-muon fit such as TRIUMF's MOLIONFCN molecular-ion function), those which are global but not shareable (e.g. relaxation parameters), those which differ for each histogram, like amplitudes a_m, and those which differ for each histogram but may be s hared with another signal, such as phases f_m in a LF experiment. The normalisations and background for each histogram are also separate parameters. To date, I haven't thought up a sensible way of creating an initial-guess file for this compound function, since the signals may be combined in any arbitrary way.

Each signal FCN routine needs only to know about its own parameters, and in fact I maintain arrays in virtual memory of the signals a_mg_m(t) and, where necessary, the partial derivatives am ¶g_m(t)/ ¶p_n. The separation of variables is achieved through an offset table holding the function's parameter number for each signal parameter - this is what also allows two signals to share a common parameter. The final calculations of Y(t) and S, and also the full derivatives in a gradient pass, are done in a separate subroutine.

File translated from T_EX by T_TH, version 2.67.
On 21 Aug 2000, 11:17.