


2016 Nordic and Baltic Stata Users Group Meeting in Oslo, Norway, Tuesday 13 September, 09:00-17:00


The 2016 Nordic and Baltic Stata Users Group meeting will be held in Oslo, Norway at Oslo Cancer Cluster Innovation Park on Tuesday 13 September, 09:00-17:00.



Cancer Registry of Norway – Institute of Population Based Epidemiology

Metrika Consulting AB, the official Stata distributor for the Nordic and Baltic countries.


Scientific committee

Tor Åge Myklebust, Statistician, The Cancer Registry of Norway.

Arne Risa Hole, Reader in Economics, Univ. Sheffield.

Hein Stigum, Senior scientist, Norwegian Institute of Public Health.

Morten Wang Fagerland, Biostatistician, Oslo Centre for Biostatistics and Epidemiology (OCBE).

Peter Hedström, Professor, Institute of Analytical Sociology, Linköping University.



Attending representatives from StataCorp

David Drukker, Executive Director of Econometrics

Yulia Marchenko, Executive Director of Statistics



The new research building (Building K), Auditorium FOBY (Forskningsbygget), Radiumhospitalet, Oslo University Hospital, Ullernchausseen 70, 0310 Oslo, Norway.



Registration is now closed.

Any late registrations should be sent to


Programme overview

08:30-08:55 Registration

09:00-09:05 Welcome and introduction

09:05-12:00 Session 1. Chair: Tor Åge Myklebust

12:00-13:00 Lunch

13:00-16:30 Session 2. Chair: Morten W. Fagerland

16:30-17:00 Wishes and grumbles session. Chair: StataCorp



Programme details with abstracts

08:30-08:55 Registration


09:00-09:05 Welcome and introduction


09:05-09:20 Processing work-history data to quantify occupational exposure using Stata

Ronnie Babigumira, Jo S Stenehjem, Tom K Grimsrud (Cancer Registry of Norway)

Work-history data can be linked to job-exposure matrices (JEMs), containing job-specific exposure ratings across different time periods, for retrospective assessment of job-specific exposure. However, work-history data often present major challenges including mapping job titles into standardized occupational codes, quality and consistency checks, and complexity arising from overlapping employment spells. We demonstrate how Stata can be used to resolve some of these challenges, specifically, complex spell structure arising because of overlapping employment periods and gaps.
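
As a rough illustration of the spell problem described in this abstract (not the authors' Stata code), the following Python sketch merges overlapping employment spells for one worker and lists the gaps between them. The function names and the representation of a spell as a pair of years are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: (start_year, end_year), end exclusive.
Spell = Tuple[int, int]


def merge_spells(spells: List[Spell]) -> List[Spell]:
    """Collapse overlapping or touching spells into non-overlapping ones."""
    merged: List[Spell] = []
    for start, end in sorted(spells):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous spell: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def find_gaps(spells: List[Spell]) -> List[Spell]:
    """Return the periods between consecutive merged spells."""
    merged = merge_spells(spells)
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(merged, merged[1:])]


history = [(1980, 1985), (1983, 1990), (1992, 1995)]  # made-up work history
print(merge_spells(history))  # [(1980, 1990), (1992, 1995)]
print(find_gaps(history))     # [(1990, 1992)]
```

Sorting first means each spell only needs to be compared with the most recently merged one. Real work histories would use dates rather than years and would keep the job code attached to each spell.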


09:20-09:35 Clinical database management: from raw data through study tabulations to analysis datasets

Inge Christoffer Olsen (Diakonhjemmet Hospital)

There are about as many ways to compile datasets for analyses as there are statisticians or epidemiologists. Some like to have wide datasets with one observation per subject and tons of variables, while others like the long format with many observations per subject. In this talk I will present the Raw/TD/AD (Raw/Tabulation Datasets/Analyses Datasets) method inspired by the standards for clinical datasets set by the Clinical Data Interchange Standards Consortium (CDISC). The CDISC standards are widely adopted by the pharmaceutical industry, but less so within academia, probably due to the rigidity of the standards. While the standards, specifically the Study Data Tabulation Model (SDTM) and the Analysis Data Model (ADaM), are in their full extent probably too extensive for academic researchers, elements of the standards could inspire a more rigorous setup of clinical databases.
I will present the basic structure of first compiling raw study data into standard datasets such as study visits, demographics, vital signs etc., and then compiling analyses datasets introducing derived variables, imputations and other formatting to form datasets ready for analyses. The talk will be given with examples from a recently finished randomised controlled trial.


09:35-09:50 Moving from SAS to Stata, making customised tables in RTF using the rtfutil and other packages

Inge Christoffer Olsen (Diakonhjemmet Hospital)

When moving from SAS to Stata, one of the major drawbacks is the difficulty of producing customised tables to be viewed in MS Word, as available in SAS using the Output Delivery System (ODS) destination for RTF. Most medical articles are prepared for submission in Word, and it is preferable to produce ready-to-use tables without the need for error-prone cutting and pasting from the Results window. In this talk I will present how this can be done in Stata using user-written packages such as parmest, xcontract and xcollapse for making datasets of results (often denoted resultssets), and then listtab and the rtfutil package for producing the RTF tables. The rtfutil package can also be used to include graphics in the RTF file, enabling the production of study reports and tables, listings and figures (TLFs). The talk will be illustrated by examples from a recent randomised controlled trial.


09:50-10:35 Creating LaTeX and HTML documents from within Stata using texdoc and webdoc

Ben Jann (University of Bern)

At the 2009 meeting in Bonn I presented a new Stata command called -texdoc-. The command allowed weaving Stata code into a LaTeX document, but its functionality and its usefulness for larger projects were limited. In the meantime, I heavily revised the -texdoc- command to simplify the workflow and improve support for complex documents. The command is now well suited, for example, to generate automatic documentation of data analyses or even to write an entire book. In this talk I will present the new features of -texdoc- and provide examples of their application. Furthermore, I will present a newly released companion command called -webdoc- that can be used to produce HTML or Markdown documents.


10:35-11:00 Coffee break


11:00-12:00 What does your model say? It may depend on who is asking

David M. Drukker (StataCorp)

Doctors and consultants want to know the effect of a covariate for a given covariate pattern. Policy analysts want to know a population-level effect of a covariate. I discuss how to estimate and interpret these effects using factor variables and margins.
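
To make the distinction concrete, here is a minimal Python sketch, not taken from the talk, that contrasts the two kinds of effect for a logistic model. The coefficients, the variable names and the sample of ages are all invented for illustration; in Stata the corresponding quantities would be obtained with margins after fitting the model.

```python
import math
from typing import Sequence

# Hypothetical logit coefficients: intercept, treatment (binary), age (years).
B0, B_TREAT, B_AGE = -4.0, 1.0, 0.05


def prob(treat: int, age: float) -> float:
    """Predicted probability from the logistic model."""
    xb = B0 + B_TREAT * treat + B_AGE * age
    return 1.0 / (1.0 + math.exp(-xb))


def effect_at(age: float) -> float:
    """Effect of treatment for one covariate pattern (a given age)."""
    return prob(1, age) - prob(0, age)


def average_effect(ages: Sequence[float]) -> float:
    """Population-level effect: average the individual effects over the sample."""
    return sum(effect_at(a) for a in ages) / len(ages)


ages = [30, 40, 50, 60, 70, 80]  # made-up sample
print(round(effect_at(40), 3))         # what a doctor might ask about a 40-year-old
print(round(average_effect(ages), 3))  # what a policy analyst might ask
```

Because the model is nonlinear, the effect on the probability scale differs across covariate patterns even though the treatment coefficient is a single number, which is why the two questions can have different answers.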


12:00-13:00 Lunch


13:00-13:30 Multi-state survival analysis in Stata

Michael J. Crowther and Paul C. Lambert (University of Leicester & Karolinska Institutet)

Multi-state models are increasingly being used to model complex disease profiles. By modelling transitions between disease states, accounting for competing events at each transition, we can gain a much richer understanding of patient trajectories and how risk factors impact over the entire disease pathway. In this talk, we will introduce some new Stata commands for the analysis of multi-state survival data. This includes -msset-, a data preparation tool which converts a dataset from wide (one observation per subject, multiple time and status variables) to long (one observation for each transition for which a subject is at risk). We develop a new estimation command, -stms-, which allows the user to fit different parametric distributions for different transitions, simultaneously, whilst allowing sharing of covariate effects across transitions. Finally, we introduce -predictms-, which calculates transition probabilities, and many other useful measures of absolute risk, following the fit of any model using -streg-, -stms-, or -stcox-, using either a simulation approach or the Aalen-Johansen estimator. We illustrate the software using a dataset of patients with primary breast cancer.


13:30-14:30 Joint modeling of longitudinal and survival data

Yulia Marchenko (StataCorp)

Joint modeling of longitudinal and survival-time data has been gaining more and more attention in recent years. Many studies collect both longitudinal and survival-time data. Longitudinal, panel, or repeated-measures data record data measured repeatedly at different time points. Survival-time or event history data record times to an event of interest such as death or onset of a disease. The longitudinal and survival-time outcomes are often related and should thus be analyzed jointly. Three types of joint analysis may be considered: 1) evaluation of the effects of time-dependent covariates on the survival time; 2) adjustment for informative dropout in the analysis of longitudinal data; and 3) joint assessment of the effects of baseline covariates on the two types of outcomes. In this presentation, I will provide a brief introduction to the methodology and demonstrate how to perform these three types of joint analysis in Stata.


14:30-15:00 Coffee break


15:00-15:30 Creating efficient designs for discrete choice experiments

Arne Risa Hole (University of Sheffield)

Over the past decades the discrete choice experiment (DCE) has become a popular tool for investigating individual preferences in several fields. This talk will describe the dcreate command which creates efficient designs for DCEs using the modified Fedorov algorithm. The algorithm maximises the D-efficiency of the design based on the covariance matrix of the conditional logit model.


15:30-16:00 Using Monte Carlo simulation for non-standard sample size/power calculation

Mike Jones (Macquarie University, Australia)

Prospective sample size calculation is an important aspect of study design, as is retrospective power calculation, particularly when statistical significance is not achieved. For comparatively simple hypothesis tests applied to simple experimental designs these quantities can be calculated using closed form analytic expressions. However, as designs and/or models become more complicated the derivation of power functions becomes difficult, and simulation is often used when analytic approaches become intractable. This talk will illustrate the use of Stata’s simulation capabilities to calculate statistical power for hypothesis tests based on arbitrarily complex statistical models. Once a model is specified as an alternate hypothesis, simulation is typically straightforward, and Stata’s ability to capture and accumulate model parameters enables straightforward calculation of statistical power.
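
The general idea of simulation-based power can be sketched in a few lines. The Python example below is only an illustration under simple assumptions (a two-sample z-test with known standard deviation 1), not the speaker's Stata code; the function name, defaults and seed are invented. It repeatedly generates data under the alternative hypothesis, runs the test, and reports the share of rejections.

```python
import math
import random
from statistics import NormalDist, mean


def simulate_power(effect: float, n_per_group: int, n_sims: int = 500,
                   alpha: float = 0.05, seed: int = 12345) -> float:
    """Estimate power of a two-sided two-sample z-test (known SD = 1) by simulation."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(2.0 / n_per_group)  # standard error of the mean difference
    rejections = 0
    for _ in range(n_sims):
        # Draw one dataset under the alternative hypothesis.
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        z = (mean(treated) - mean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    # Power is the share of simulated datasets in which the null is rejected.
    return rejections / n_sims


print(simulate_power(effect=0.5, n_per_group=50))
```

For a test this simple a closed-form answer exists, which makes it a useful check; the same loop carries over to models where no such formula is available, by replacing the data-generating step and the test.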


16:00-16:30 The Case-Cohort design: What it is and how it can be used in register-based research

Anna Johansson (Karolinska Institutet) 

This presentation will give a brief theoretical background and history of case-cohort studies, which dates back to the key publication by Prentice in 1986. Examples of situations when the case-cohort design is useful will be given, in particular in a register-based setting with total population registers. The case-cohort design will be compared to the nested case-control design, and advantages and disadvantages will be presented. From a case-cohort design it is possible to estimate the same measures of effects (e.g. hazards, hazard ratios, hazard differences) that can be estimated in a standard cohort study, provided that weights are included to account for the over-sampling of cases. Hence, in practice, the analysis of a case-cohort study is similar to that of a cohort study (e.g. Cox regression, Poisson regression and flexible parametric models), with the addition of proper weights. Stata code for how to sample a case-cohort study from a cohort study and how to incorporate weights into the analysis will be presented. As an example, we will present a study on risk for breast cancer following pregnancy using data from the Swedish Multi-Generation Register and the Swedish Cancer Register. In this study we utilised the case-cohort design to reduce the analytical dataset and to improve computational efficiency.
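
As a hedged sketch of the sampling step described above (not the code presented in the talk), the Python example below draws a random subcohort from a cohort, adds all cases that fall outside it, and attaches weights. The weighting used here, weight 1 for cases and the inverse sampling fraction for non-cases in the subcohort, is one common choice and is an assumption of this example, as are the function and field names.

```python
import random
from typing import Dict, List


def sample_case_cohort(cohort: List[Dict], fraction: float, seed: int = 2016) -> List[Dict]:
    """Draw a random subcohort, add all cases outside it, and attach weights.

    Weighting assumed here: cases get weight 1, non-cases in the subcohort
    get weight 1 / fraction (the inverse of their sampling probability).
    """
    rng = random.Random(seed)
    sample = []
    for person in cohort:
        # Each person enters the subcohort independently with probability `fraction`.
        in_subcohort = rng.random() < fraction
        if person["case"]:
            sample.append({**person, "subcohort": in_subcohort, "weight": 1.0})
        elif in_subcohort:
            sample.append({**person, "subcohort": True, "weight": 1.0 / fraction})
    return sample


# Made-up cohort: 1000 people, every 50th is a case.
cohort = [{"id": i, "case": i % 50 == 0} for i in range(1000)]
sample = sample_case_cohort(cohort, fraction=0.1)
print(len(sample), sum(p["weight"] for p in sample))
```

The analysis dataset is much smaller than the full cohort, while the sum of the weights approximates the original cohort size, which is what allows a weighted analysis to recover cohort-level estimates.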


16:30-17:00 Wishes and grumbles


This session gives you the opportunity to air your thoughts to Stata developers.



Please consult the official travel guide to Oslo:

From Oslo Airport, the Airport Express Train leaves every 15 minutes and takes 25 minutes to Nationaltheatret Station, which is next to the metro station on the main street, Karl Johans gate.

The westbound metro line 3 (Kolsås) takes 14 minutes to Montebello metro station, from which the meeting venue at Oslo Cancer Cluster Innovation Park is an easy 7-minute downhill walk.



In the city center area close to the main street, Karl Johans gate, you will find a range of hotels. Here are some you might consider:

5-star: Hotel Continental.

4-star: Thon Hotel Rosenkrantz.

3-star: Best Western Karl Johan Hotel.

2-star: Cochs Pensjonat in the corner of Royal Palace Park with central location, solid accommodation and reasonable prices.

Norlandia Hotell Montebello, at the Norwegian Radium Hospital, is a patient hotel with limited places during the week for non-patients.

Further information:



The meeting is free, but there will be an option to order lunch, for which a small fee will be charged. Because there are few dining alternatives at the conference venue, we recommend ordering the optional lunch.

Post-conference course

A one-day course in "Flexible Parametric Survival Models" by Professor Paul C. Lambert will be held on September 14, 2016, the day after the Stata Users Group meeting. Please contact the meeting coordinator Bjarte Aagnes for further information about this event.