Although pain is widely recognized to be a multidimensional experience, it is typically measured by a unidimensional patient self-reported visual analog scale (VAS). However, self-reported pain is subjective, difficult to interpret, and sometimes impossible to obtain. Machine learning models have been developed to automatically recognize pain at both the frame level and the sequence (or video) level. Many methods use or learn facial action units (AUs), defined by the Facial Action Coding System (FACS) for describing facial expressions in terms of muscle movements. In this paper, we analyze the relationship between sequence-level multidimensional pain measurements and frame-level AUs, as well as an AU-derived pain-related measure, the Prkachin and Solomon Pain Intensity (PSPI). We study methods that learn sequence-level metrics from frame-level metrics. Specifically, we explore an extended multitask learning model that predicts VAS from human-labeled AUs, with the help of other sequence-level pain measurements during training. This model consists of two parts: a multitask learning neural network that predicts multidimensional pain scores, and an ensemble learning model that linearly combines the multidimensional pain scores to best approximate VAS. Starting from human-labeled AUs, the model achieves a mean absolute error (MAE) on VAS of 1.73. It outperforms the provided human sequence-level estimates, which have an MAE of 1.76. Combining our machine learning model with the human estimates gives the best performance, an MAE on VAS of 1.48.
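The two-stage structure described above can be sketched in a few lines. The sketch below is illustrative only: it uses synthetic data, linear least-squares fits in place of the paper's actual multitask neural network and ensemble, and hypothetical feature dimensions.

```python
import numpy as np

# Hypothetical sketch of the two-stage pipeline: stage 1 predicts several
# sequence-level pain scores from per-sequence AU features; stage 2 linearly
# combines those scores to approximate VAS. Shapes and models are illustrative.

rng = np.random.default_rng(0)

n_seq, n_aus, n_scores = 200, 12, 4         # sequences, AU features, pain measures
X = rng.random((n_seq, n_aus))              # e.g., mean AU intensities per sequence
W_true = rng.random((n_aus, n_scores))
scores = X @ W_true + 0.1 * rng.standard_normal((n_seq, n_scores))
vas = scores @ np.array([0.4, 0.3, 0.2, 0.1])   # synthetic VAS target

# Stage 1: multitask regression - shared AU input, several pain-score outputs
# (a stand-in for the multitask neural network).
W1, *_ = np.linalg.lstsq(X, scores, rcond=None)
pred_scores = X @ W1

# Stage 2: ensemble - learn a linear combination of the predicted scores
# that best approximates VAS.
w2, *_ = np.linalg.lstsq(pred_scores, vas, rcond=None)
pred_vas = pred_scores @ w2

mae = np.mean(np.abs(pred_vas - vas))
print(f"MAE on VAS: {mae:.3f}")
```

On this easy synthetic data the MAE is near zero; the paper's reported MAE of 1.73 reflects real clinical sequences, where the mapping from AUs to self-reported pain is far noisier.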