User Guide - UK Data Service
Contents
1. 1 INTRODUCTION
This user guide provides detailed information pertaining to data arising from all seven waves of the Longitudinal Study of Young People in England (LSYPE), managed by the Department for Education (DfE) and its predecessors. These data are available to download from the UK Data Archive and iLSYPE. Previous versions of this user guide were created by NatCen as part of work to enhance LSYPE data under contract from DfE. This version of the guide, along with its accompanying data, completes the data enhancement process for the DfE-managed waves of LSYPE. All enhancement work for Wave 7 of the study, including the final version of this guide, was completed internally by DfE.
1.1 Background of the Study
The Longitudinal Study of Young People in England (LSYPE) is a large-scale and innovative panel study of young people which began in 2004. Respondents were first interviewed in the spring of 2004 at age 13 and then subsequently interviewed annually until 2010, resulting in a total of seven waves. LSYPE is one of the main information sources for the formation and appraisal of Government policie
2. Health; Employment activity history; Current activity; Second adult current activity; Employment, training and earnings; Qualifications and education; Benefits and tax credits; Income estimate; Job search.
Young person section: Homework; Study support; Extra classes; Future plans and advice; Information, advice and guidance; Higher education plans; Higher education; Potential higher education students; Attitudes to higher education; Attitudes to debt; Attitudes to work; Volunteering; Voting; Religion; Relations with parents; Perceived discrimination; Knowledge of and intentions towards Apprenticeships and related schemes; Mental health; Life satisfaction; Household responsibilities; Childcare and caring responsibilities; Care to learn; Education Maintenance Allowance (EMA); Job search; Use of leisure time; Car use; Sport frequency; Risk factors (truancy, bullying, smoking, drugs); Health and disability; Relationships and sexuality; Parental employment; Income and benefits; Internet access; SIC/SOC codes; Data linkage consent; Future contact details.
History section (Waves 1, 2 and 4 only).
Homework; Study support; Extra classes; Future plans and advice; Information, advice and guidance; Higher education plans; Higher education; Potential higher education students; Attitudes to higher education; Attitudes to
3. was also included, giving further specific information about the study. An introductory letter was sent to all the respondents selected from the ethnic boost sample in early May 2007. This introduced them to the survey and explained that they would be contacted by an interviewer later in the summer. Advance letters were additionally sent to the young people by interviewers just before commencing their assignment. The advance letters were tailored for each of the sample groups: those who were interviewed in Wave 3, those who had previously refused and asked to take part, Wave 3 movers who were traced, and ethnic boost respondents. Advance letters were also sent to parents/guardians of the young people, and again these were tailored for the different sample groups. Both letters advised that the interviewer would be calling at their address following the receipt of the letter. The young person's letter included an unconditional £8 voucher incentive.
Specific instructions were provided to interviewers to deal with cases where the young person was not currently living in the parental home. If the young person had moved out of the parental home and was living in another private household, either independently or with a partner, friends or relatives, interviewers were asked to follow up the young person there. If the young person had moved into armed forces accommodation, the interviewer was asked to try to find out if the young person
4. East of England 11.0
London 12.7
South East 15.5
South West 9.8
Qualifications
Not achieved level 1 10.5
Achieved level 1 but not 2 33.0
Achieved level 2 56.5
Sex
Male 51.0
Female 49.0
With the exception of pupils from London, the data were calibrated to marginal totals rather than the cell totals. This means, for example, that the proportion of White respondents will be the same in the weighted sample and in the population at Wave 1. Similarly, the proportion of pupils in the North East will be the same in the weighted sample and in the population. Despite this, the proportion of pupils in individual cells (such as White respondents in the North East) might vary between the weighted sample and the population. London was treated differently. The ethnic breakdown of pupils in London is very different from that in other parts of the country; therefore, because response among ethnic minorities in London was quite high, it was possible to calibrate respondents in London to their cell totals.
6.1.3 Combining maintained and independent school weights
The final stage was to weight the sample so that the maintained/independent school split matched the population proportions (92.5% maintained, 7.5% non-maintained). This weight variable is called W1FinWt.
6.1.4 Effects of the weighting
The purpose of weighting is to eliminate bias in the estimates of population quantities. However, when the calculated weights are very varia
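The sector adjustment described in Section 6.1.3 can be illustrated with a short sketch. It assumes the adjustment is a simple rescaling of each sector's weights so that the weighted shares equal 92.5% and 7.5%; the guide does not give the formula, and the function name, record layout and example weights below are hypothetical.

```python
# Target population shares quoted in Section 6.1.3.
TARGET_SHARE = {"maintained": 0.925, "non_maintained": 0.075}


def rescale_to_sector_shares(records, target=TARGET_SHARE):
    """Return (sector, weight) pairs whose weighted sector shares match `target`.

    `records` is a list of (sector, weight) pairs. The overall weighted
    total is preserved; only the split between sectors changes.
    """
    total = sum(weight for _, weight in records)
    sector_total = {}
    for sector, weight in records:
        sector_total[sector] = sector_total.get(sector, 0.0) + weight
    # One multiplicative factor per sector: desired total / current total.
    factor = {s: target[s] * total / t for s, t in sector_total.items()}
    return [(sector, weight * factor[sector]) for sector, weight in records]


# Hypothetical example: four maintained and two independent school pupils.
sample = [("maintained", 1.0)] * 4 + [("non_maintained", 1.5)] * 2
adjusted = rescale_to_sector_shares(sample)
weighted_total = sum(weight for _, weight in adjusted)
maintained_share = (
    sum(weight for sector, weight in adjusted if sector == "maintained")
    / weighted_total
)
```

Because each sector gets a single scaling factor, the relative weights of pupils within a sector are unchanged, which is why this kind of step can be applied after the earlier calibration.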
5. young person module. There was no parental interview.
Identification of the main parent differed between the main sample households and the ethnic boost sample households.
Main Sample
The main parent interview could be conducted with either parent/guardian. Interviewers were asked to pick the parent who they felt was more likely to take part. This is a change from previous years, where the main parent was identified as the parent most involved in the young person's education. This was no longer necessary at Wave 4, as the parent interview focused largely on the employment status and health of the parents themselves.
Ethnic Boost Sample
The main parent was determined by the following order of priority:
1 Natural mother
2 Natural father
3 The parent most involved in the sample member's education
It should be noted that, as in previous waves, the main parent and second parent roles at previous waves were not necessarily carried forward to Wave 4, even in cases where the parent(s) were still living with the child. For example, at Wave 3 the mother may have answered the main parent questionnaire and the father the second parent, but at Wave 4 these roles could have reversed. Cases where this occurred can be identified by comparing the positions in the Household Grid at previous waves.
4.5 Fieldwork at Wave 5
Wave 5 fieldwork ran from 3 June 2008 to 28 October 2008. This wave involved a number of significant cha
6. 6.5 Wave 5 weights
Weights to account for non-response from certain groups between Waves 4 and 5 were calculated in two stages. Firstly, the design weights were selected to account for the probability of being in the sample. At Wave 5 these were the final weights from Wave 4. With these weights applied, the profile of the issued cases was then compared to that of the achieved cases with regard to a range of variables from Wave 1. Similar to Wave 4, respondents from the main and boost cohorts had to be considered separately.
For the larger main cohort, a logistic regression was carried out to see how well response could be predicted; however, the models tested were poor predictors of non-response due to generally high response rates. The estimates generated by the model were not similar enough to the actual response rates among the subgroups, and so cell weighting was used instead. A range of variables were tested, with those used for the non-response weights being combinations of sex and economic activity at Wave 4. The response rates for the groups that were used for weighting are shown in Table 19.
Table 19 Response rates used to calculate Wave 5 main cohort weights
Wave 4 Economic Activity | Male (%) | Female (%)
Full-time education | 90.97 | 93.31
Full-time work | 80.42 | 84.85
Part work & part training | 86.75 | 87.56
Training course or Apprenticeship | 89.01 | 86.33
Something else (Not NEET) | 82.81 | 81.08
Something el
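As a rough illustration of cell weighting, the sketch below turns the Table 19 response rates into non-response adjustments by dividing the Wave 4 final weight by the response rate of the respondent's cell. This inverse-response-rate form is a common convention and is assumed here; the guide does not state the exact formula, and the function name and example weights are hypothetical. Only the five cells whose rates are fully visible in Table 19 are included.

```python
# Response rates (%) from Table 19, keyed by (Wave 4 economic activity, sex).
RESPONSE_RATE = {
    ("Full-time education", "Male"): 90.97,
    ("Full-time education", "Female"): 93.31,
    ("Full-time work", "Male"): 80.42,
    ("Full-time work", "Female"): 84.85,
    ("Part work & part training", "Male"): 86.75,
    ("Part work & part training", "Female"): 87.56,
    ("Training course or Apprenticeship", "Male"): 89.01,
    ("Training course or Apprenticeship", "Female"): 86.33,
    ("Something else (Not NEET)", "Male"): 82.81,
    ("Something else (Not NEET)", "Female"): 81.08,
}


def wave5_weight(wave4_weight, activity, sex):
    """Scale a Wave 4 final weight by the inverse of the cell response rate.

    Assumed form: cells with lower response get a larger adjustment, so
    respondents stand in for similar people who dropped out.
    """
    rate = RESPONSE_RATE[(activity, sex)] / 100.0
    return wave4_weight / rate


# Hypothetical respondent with a Wave 4 final weight of 1.0.
example = wave5_weight(1.0, "Full-time work", "Male")
```

Under this assumption, males in full-time work at Wave 4 (the lowest rate shown for the first four activities) receive a larger adjustment than females in full-time education (the highest).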
7. and refers to the Household Grid position as shown at WxHhid. This variable is at the household level; therefore all members of a household will have the same value unless they were not present at a specific wave.
10.2.10 WxHistres
This variable shows the position of the person who answered the History questionnaire and refers to variable WxHhid. This variable is at the household level; therefore all members of a household will have the same value unless they were not present at either Wave 1 or Wave 2, or were not included as a boost case at Wave 4.
10.2.11 WxMPint
This variable identifies whether the Main Parent interview was conducted at Wave x. The variable is at the household level, and as such all members of a household will have the same value unless they were not present at the wave in question.
3 The history section was not included in the Wave 3 interview.
4 Users should note that this variable does not indicate whether the interview section was partially or fully completed by the respondent.
5 See Appendix D, Section A, for details on how this variable and the variables discussed in sections 8.2.11 to 8.2.14 were derived.
10.2.12 WxSPint (Waves 1, 2 and 4 only)
This variable shows whether the Second Parent interview was conducted at Wave x. The variable is at the household level, and as such all members of a household will have the same value unless they were not present at the wave in ques
8. step, adoptive or foster parent. During Wave 2 it became apparent that partners of the main parent were not necessarily identified as being a parent or guardian to the young person. Therefore the definition of the second parent was changed, and all second parents were identified as those who were a partner or spouse of the main parent. Due to the number of second adults not interviewed at Wave 1, considerable efforts were made by interviewers to speak to these people at Wave 2.
It should be noted that the main parent and second parent roles at Wave 1 were not necessarily carried forward to Wave 2, even in cases where the parent(s) were still living with the child. For example, at Wave 1 the mother may have answered the main parent questionnaire and the father the second parent, but at Wave 2 these roles could have reversed. Cases where this occurred can be identified by comparing the positions in the Household Grid at Waves 1 and 2.
4.3 Fieldwork at Wave 3
Wave 3 fieldwork ran from 21 April 2006 to 28 September 2006. Wherever possible, interviewers were assigned to the same households they had interviewed at Wave 2. All interviewers were briefed via face-to-face briefings. Advance letters were sent by interviewers just before commencing their assignment. Letters were sent to all the young people who participated in Wave 2 and were selected for Wave 3. A second letter was also sent to the par
9. 10.2 Variable descriptions
10.2.1 SurveyID
10.2.2 AHIDI
10.2.3 Age
10.2.4 Hdobm and Hador
10.2.5 Sex
10.2.6 RetoYP
10.2.7 RetoYP2
10.2.8 WxMainres
10.2.9 WxSecres
10.2.10 WxHistres
10.2.11 WxMPint
10.2.12 WxSPint (Waves 1, 2 and 4 only)
10.2.13 NWS Pint
10.2.14 AAS WY
10.2.15 WxHistint (Waves 1, 2 and 4 only)
10.2.16 WXMOM er
10.2.17 WX AtNO R
10.2.18 WxMPMother, WxMPFather
10.2.19 WxSPMother, WxSPFather
10.2.20 W2Newmember, W3Newmember, W4Newmember
10.2.21 Nouse and W5Nouse
10. 93 Question routing error: respondent asked question not relevant to their situation despite being correctly routed
94 Insufficient information: mainly used for derived variables, and signifies that there is relevant information missing from source variables
95 Unable to classify/code: response cannot be allocated within defined code frame
MP/SP/YP unable to complete CASI section: used to signify that a respondent was unable to complete the self-completion section. This value label was also used to identify respondents who had used an interpreter
MP/SP/YP refused CASI section: used to signify that a respondent refused to answer the self-completion section
MP/SP not present: used to signify that a respondent was not identified for this part of the questionnaire module, i.e. respondent was a single parent
MP/SP/YP not interviewed: used to signify that a respondent was identified as eligible to answer the relevant questionnaire modules but was not interviewed; this may be due to a number of reasons, e.g. not being available on the day the interview was conducted
100 Respondent declined to answer sexual experience questions
996 No parent in household: used in later waves, where a respondent may live away from parents
997 Script error: data missing for question
998 Interviewer missed question: used to signify item non-response due to interviewer/CAPI error
999 Missing househ
11. (MP) and the second parent (SP). The following three files have been deposited as cross-sectional data files:
- Wave One LSYPE Family Background file (May 2008)
- Wave One LSYPE Parental Attitudes file (May 2008)
- Wave One LSYPE Young Person file (May 2008)
A fourth file represents the Household Grid information collected at Wave 1:
- Wave One LSYPE Household Grid (May 2008)
The Household Grid file is a hierarchical file, therefore containing one row for each individual identified in the household. This file contains a total of 70,643 cases, representing the 15,770 households who participated in the survey. The LSYPE Household Grid files are not deposited and are only available to approved researchers who make a request to the Department for Education (DfE) using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.
5.2 Wave 2 deposited data
Three data files have been deposited for Wave 2, based on the information collected from the young person, main parent and the second parent. The following three files have been deposited as cross-sectional data files:
- Wave Two LSYPE Family Background file (June 2008)
- Wave Two LSYPE Parental Attitudes file (June 2008)
- Wave Two LSYPE Young Person file (June 2008)
A fourth file represents the Household Grid information
12. Problem identified: although the question specifies that it is referring to those who receive it not as an unemployed person.
Action taken: Their responses have been retained in the data.

Variable/section: W6SexAgeYP
Problem identified: A small number of young people responded with very low ages to this question about the age at which they had first consensual sex.
Action taken: These responses have been coded into category 1, "Under 14".

Variable/section: W6SexSafeOften
Problem identified: These multicoded variables include a third response, "Not applicable (respondent defined)". Therefore these variables have three categories (1 Yes, 2 No, 3 N/a respondent defined) rather than the two that would normally be expected with binary multicoded variables.
Action taken: None.

Variable/section: W6Wrk12YP
Problem identified: This question has been asked of the young person and parents in previous waves. However, in Wave 6 there was no response option for self-employed young people who don't employ anyone else. It may be that they responded "Don't know" instead. In derived variables that include this variable, anyone who is coded "Don't know" is coded to 0 employees in the derived variable.
Action taken: None.

Variable/section: Household Grid
Problem identified: A small number of young people completing the online survey were swerved around the household section, as it was felt that the interviewer notes about household members would not be understood by the respondent.
Action taken: Coded to 999 in the relevant variables.

Table 27 Issues with Wave 7 data
Variable/section | Problem identified | Action taken
w7NEE
13. collected at Wave 2:
- Wave Two LSYPE Household Grid file (June 2008)
This file is a hierarchical file, therefore containing one row for each individual identified in the household. This file contains a total of 62,314 cases, representing the 13,539 households who participated in the survey. The LSYPE Household Grid files are not deposited and are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.
5.3 Wave 3 deposited data
Three data files have been deposited for Wave 3, based on information collected from the young person, main parent and the second parent. The following three files have been deposited as cross-sectional data files:
- Wave Three LSYPE Family Background file (June 2008)
- Wave Three LSYPE Parental Attitudes file (June 2008)
- Wave Three LSYPE Young Person file (June 2008)
A fourth file represents the Household Grid information collected at Wave 3:
- Wave Three LSYPE Household Grid file (June 2008)
This file is a hierarchical file, therefore containing one row for each individual identified in the household. This file contains a total of 56,614 cases, representing the 12,439 households who participated in the survey. The LSYPE Household Grid files are not deposited and are only av
14. data to use for non-response modelling, and because of this different models were used for full and partially productive pupils.
Fully productive respondents
A logistic regression model was used to estimate the response probabilities of each pupil. The data were weighted by the Wave 1 weight (scaled to equal the achieved sample size) before modelling. Various variables were used as potential explanatory variables. Some were variables obtained from the sample frame, such as ethnicity, Government Office Region (GOR), type of school and pupil's qualifications. Others were socio-economic variables obtained from the MP's answers to the Wave 1 questionnaire, such as the MP's single parent status, current working status, income support status, etc. Additionally, some were other answers to the Wave 1 questionnaire, such as use of cannabis, language spoken at home, and the pupil's plans on their education after reaching the school leaving age.
A forward stepwise logistic regression procedure was used to model whether or not a pupil responded. Nine variables were identified as statistically significant at the 10% level. These nine and another three (YP's sex, School's admission status and School's deprivation status) were included in the final logistic regression model. The twelve variables are summarised in Table 15.
18 The final three were included because they had been used in the stratification when the original selection was
15. debt; Attitudes to work; Volunteering; Voting; Religion; Relations with parents; Perceived discrimination; Mental health; Life satisfaction; Household responsibilities; Childcare and caring responsibilities; Care to learn; Education Maintenance Allowance (EMA); Job search; Use of leisure time; Car use; Sport frequency; Risk factors (truancy, bullying, smoking, drugs); Health and disability; Relationships and sexuality; Parental employment; Income and benefits; Internet access; SIC/SOC codes; Data linkage consent; Future contact details.
History section: Birth; Health; School history; Choice of current school; Sibling experience; Relationship history; Living with young person; Reasons for not living with natural parents.
Key to table symbols: section included; section asked of boost respondents only; asked of MP only.
5 SURVEY CONTENT
5.1 Wave 1 deposited data
The Wave 1 LSYPE dataset was originally deposited in December 2006. Since this time, extensive work has been undertaken to enhance the data by updating the variable names and labels and cleaning any inconsistencies within the file. Three data files have been deposited for Wave 1, based on information collected from the young person (YP), the main parent
16. each household member to the young person. If the information collected suggested that the young person was a parent to another household member, but this other household member was older than the young person, then this information would be edited. Edits are only carried out if a relevant correction is easily identified; for example, if we know the household member is actually the parent of the young person, then we would amend the relationship to indicate this. If we were unable to identify a correction using the data available (for example, the relationship is unknown), then a system missing value is created.
A number of variables have been derived to enhance the data; details of these are available in accompanying derived variable documentation. During the process of deriving certain variables, it became necessary to edit the data within the derivation, leaving the raw data unedited. This mainly affected income derivations at Wave 1. Collecting data on income is notoriously difficult, as respondents may refuse to answer these questions, and in other cases there are some obvious instances of respondent or interviewer reporting error (e.g. reporting an income of £32 per annum instead of £32,000). Where possible, we have corrected this information within the derived variable.
A slight amendment has also been made to the household NS-SEC variables derived at Waves 3, 4 and 5. In previous waves the NS-SEC variables were derived
17. history; Employment, training and earnings; Benefits and tax credits; Income estimates.
Young person section: Parental employment.
Key to table symbols: section included; section asked of boost respondents only; asked of MP only.
Note: At Wave 6 and Wave 7 there is no Family Background file.
5.15.2 LSYPE Parental Attitudes file
The content of the Parental Attitudes file is summarised in Table 4, indicating the level of information available at each wave.
Table 4 Summary of content of LSYPE Parental Attitudes file
Questionnaire Section and Content (columns: Wave 1, Wave 2, Wave 3, Wave 4)
Main parent section: involvement in education; Extra-curricular classes; Year 10 subject choices; Special educational needs; Parental expectations and aspirations; School history; Year 11 experiences; Post-16 plans.
History section: Choice of current school.
Key to table symbols: section included; section asked of boost respondents only.
Note: At Waves 5, 6 and 7 there is no Parental Attitudes file.
5.15.3 LSYPE Young Person file
The content of the Young Person file is summarised in Table 5, indicating the level of information available at each wave.
Table 5 Summary of content of LSYPE Young Person file
Questionnaire Section and Content Wave Wave Wave Wave Wave Wav
18. is alternatively available from the LSYPE Monthly Main Activity file for Waves 4 to 7. See Section 5.8 for more information on this file.
5.7 Wave 7 deposited data
One data file has been deposited for Wave 7, based on the information collected from the young person. This has been deposited as a cross-sectional data file:
- Wave Seven LSYPE Young Person file (November 2011)
Two further files for Wave 7 are available but not deposited:
- Wave Seven LSYPE Household Grid file (November 2011)
- Wave Seven LSYPE Activity History file
The Household Grid is a hierarchical file containing one row for each individual identified in the household. The file contains a total of 45,839 cases, representing the 8,682 households who participated in the survey. The Activity History file is also hierarchical, containing one row for each activity completed by the respondent since the prior interview. The LSYPE Household Grid and Activity History files are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.
Information on the activities completed by the young person respondent since the prior interview is alternatively available from the LSYPE Monthly Main Activity file for Waves 4 to 7. See Section 5.8 for more information on t
19. response, the initial sample was reduced due to some pupils subsequently being found to be ineligible for the study prior to issue, and the sample containing a number of duplicates.
In-field tracing of movers
Movers were traced using the stable contact address details collected at Wave 1. Where this failed, a letter was sent to the head teacher(s) of the school from which the young person was sampled, to locate up-to-date address details for them. In total, 13,539 households took part in Wave 2.
2.6 Sampling at Wave 3
The survey attempted to follow all the 13,539 households who took part in LSYPE Wave 2 where the child was still alive and living in the UK. Of the 13,539 households from Wave 2, a total of 13,525 households were issued at Wave 3. Altogether 14 cases were not issued at Wave 3; these cases had either refused or opted out prior to Wave 3.
In-field tracing of movers
Movers were traced using the stable contact address details collected at Wave 1. Where this failed, a letter was sent to the head teacher(s) of the school from which the young person was sampled, to locate up-to-date address details for them. In addition, some movers were traced by using the address details for all Year 11 pupils from the school census. In total, 12,439 households took part in Wave 3.
2.7 Sampling at Wave 4
The survey attempted to follow all the 12,439 households who took part in LSYPE Wave 3 where the child was still alive and living in t
20. sav
* Both files must already be sorted by surveyid before matching.
GET FILE='C:\wave_one_lsype_young_person_file_16_05_08.sav'.
MATCH FILES /FILE=*
  /FILE='C:\wave_one_lsype_parental_attitudes_file_16_05_08.sav'
  /BY surveyid.
SAVE OUTFILE='C:\wave_one_lsype_young_person_and_parental_attitudes_file.sav'.
EXECUTE.
5.10 Multicoded variables
Multicoded variables are obtained from questions where the interviewer is instructed to code all that apply. Each response category has a separate variable in the dataset. For example, the main and second parents' education details have been stored within the datasets as multicoded variables; therefore, if a main parent has answered that they are educated to degree level and have GCSE grades A-C, then they will have a "yes" response in both of these separate variables.
5.11 Missing values
Due to the complexity of the information collected during the survey, a number of missing value categories have been adopted. These are shown in Box 2.
Box 2 Summary of missing values applied to the LSYPE data
Valid Missing Values (included within published calculated percentages)
1 Don't know: enables respondents to answer "don't know" to questions
Invalid Missing Values (excluded from published calculated percentages)
91 Not applicable: used to signify that a question did not apply to a respondent, usually due to routing
92 Refused: used to signify when a respondent has refused to answer a particular question
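The distinction in Box 2 between valid and invalid missing values affects the base used for percentages. The sketch below is one way to apply it, not the procedure used for the published figures. It assumes the missing codes are stored as negative numbers (for example -92), since minus signs appear to have been lost from the text here; the function name and the example responses are hypothetical, and only the codes listed so far in Box 2 are included.

```python
# Assumption: missing codes are negative in the data files (e.g. -92).
VALID_MISSING = {-1: "Don't know"}                         # kept in the base
INVALID_MISSING = {-91: "Not applicable", -92: "Refused"}  # dropped from the base


def percent_distribution(values):
    """Return {code: percentage} after dropping invalid missing values.

    Valid missing values such as "Don't know" stay in the base, so they
    receive their own percentage alongside the substantive answers.
    """
    kept = [v for v in values if v not in INVALID_MISSING]
    if not kept:
        return {}
    counts = {}
    for v in kept:
        counts[v] = counts.get(v, 0) + 1
    return {code: 100.0 * n / len(kept) for code, n in counts.items()}


# Hypothetical answers to a yes/no item (1 = Yes, 2 = No).
responses = [1, 1, 2, -1, -91, -92, 2, 1]
result = percent_distribution(responses)
```

Further invalid codes can be added to `INVALID_MISSING` in the same way; if the deposited files store the codes with a different sign, only the two dictionaries need to change.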
21. that within a deprivation stratum, all pupils within an ethnic group had an equal chance of selection.
2.1 Sample exclusions
Exclusions were made from the sample, which was taken from a school census database supplied by the then Department for Education and Skills (DfES). Among these exclusions were children educated solely at home (and therefore not present on a school roll), pupils in schools with fewer than 10 (maintained sector) or 6 (independent sector) Year 9 pupils, boarders (including weekly boarders), and children residing in the UK solely for education purposes.
2.2 Stage one: Sampling schools
The first stage of sampling was approached separately for maintained schools and for independent schools and pupil referral units (PRUs). In the maintained sector, the sample was drawn using the Pupil Level Annual Schools Census (PLASC). Maintained schools were stratified by deprivation status, where deprivation was measured by the proportion of pupils in receipt of free school meals, and deprived schools were defined as those in the top quintile of this distribution. Within each deprivation stratum, school selection probabilities were calculated based on the number of pupils in Year 9 from the six major minority ethnic groups referred to above. Within each stratum, maintained schools were ordered, and thus implicitly stratified, by region then by school admissions policy before selection. 838 schools were selected in the maintained sector. Indepen
22. the Secores variables in the appropriate wave indicates that the second parent is not present in the household. However, the Household Grid may still contain some details about these people, such as their relationship to the young person, even if they are no longer living with the young person. These parents can be identified by selecting on the variable W2SHGINT = 2, W3SHGINT = 2 or W4SHGINT = 2, depending on LSYPE wave.
5.15.6 LSYPE History file (Waves 1 and 2)
The History file has been constructed using information collected at Waves 1 and 2. Most of the information was collected at Wave 1, but where the Wave 1 history interview was not completed, interviewers attempted to collect most of the information at Wave 2. Where the Wave 1 history interview was completed, the Wave 2 history interview was very short, covering only the relationship history. The content of the history file is summarised in Table 7. This file also includes a number of derived variables, which are all outlined in the derived variable documentation. These derived variables link respondent information longitudinally.
not appear as being present in the household twice, even if the same person appears twice in the file. The variable W5NoUse can be used to identify cases where this appears to be a problem.
Table 7 Summary of content of LSYPE History file
Respondent: Main Parent
Summary of content: Employment Activity History since LAST INTERVIEW; Current Activi
23. using the respondent's current or most recent occupation details. The Wave 3, 4 and 5 interviews only collected the person's current employment detail; therefore the NS-SEC variables are slightly different and have been given an amended name to identify this: the variable name will include a prefix of "c" to represent this change. It should also be noted that at Waves 6 and 7 the NS-SEC variables have been derived solely using the responses of the current details of the young person.
5.15 Datasets
For the purposes of archiving the data, it was necessary to remove a number of variables. Some of these variables relate to introduction sections within the CAPI programme. These variables are asked of the interviewer and therefore are not necessary within the dataset, although the question remains in the questionnaire documentation.
13 In Wave 5 the current occupation information for the parents in the household was obtained from the responses of the young person.
14 If users are interested in looking specifically at the differences, please refer to the derived variable guides for all previous waves, which include the syntax used to create these variables.
A number of other variables have been removed from the dataset which might compromise the anonymity of the young person and their families. This relates to variables such as the exact date of birth (although age at interview and year of birth are available) and any answers to
24. 25 minutes for the main sample members and 35 minutes for boost sample members.

Parental Module

This was completed by the main parent. The module comprised three parts: a main parent section (asked primarily of boost respondents, with a few questions asked of all respondents) and two individual parent sections, the first relating to the main parent and the second to the main parent's partner (if applicable). The individual parent sections collected details about the employment, education and training, and health of each parent. If the partner was present at the time of the interview, the individual parent (partner) questions were asked directly of them. If not, the main parent was asked to answer on behalf of their partner. Overall, the parental module lasted approximately 10 minutes for main sample parents and 25 minutes for ethnic boost sample members.

The following rules were followed to determine who should complete each section and in what order.

For young people living with parents in the parental home, the main parent was to complete the household information module and the parental module. The household module had to be completed before the parental module. The young person completed the young person module, and this could be done before or after the household and parental modules.

For young people living outside the parental home, the young person completed the household information module first, followed by the
25. 68 young people issued from the main sample at Wave 4, the survey reached 11,449 households (92%), comprising 11,053 full interviews (89%) and 396 partial interviews (3%). Partial interviews constituted 196 young people and 202 main parents not being interviewed; in 2 cases neither respondent was interviewed.

3 Of the 11,053 full interviews, there are 166 households where the YP did not live with a parent.

Ethnic boost sample at Wave 4

Of the 600 young people issued to the boost sample at Wave 4, the survey reached a total of 352 interviews (59%), comprising 309 full interviews (52%) and 43 partial interviews (7%). Partial interviews were made up of 17 young people and 27 main parents not being interviewed, with both respondents not being interviewed in one household. Of the 309 full interviews, there is no main parent interview for 7 boost households where the young person did not live with a parent.

3.5 Response rates at Wave 5

Of the 11,793 young people issued at Wave 5, the survey reached 10,430 households (88%). This was made up of 3,832 (32%) online interviews, 5,140 (44%) telephone interviews and 1,458 (12%) face-to-face interviews. At Wave 5 only the sampled young person completed the interview.

3.6 Response rates at Wave 6

Of the 11,225 young people issued at Wave 6, the survey reached 9,799 households (87%). This was made up of 3,803 (39%) online interviews, 4,705 (48%) telephone interviews and 1,291 (13%) face
26. It is likely that the characteristics of the non-responding pupils were different from those of responding pupils, which could lead to biases in estimates of population quantities. A statistical model was used to model the differences between those who responded and those who did not. This enabled the derivation of non-response weights to reduce bias. Logistic regression models were used to estimate a pupil's response probability, and the non-response weights were then calculated as the reciprocal of this estimated response probability. The non-response weight was combined with the Wave 1 weight W17Finwt to provide the Wave 2 weight W2Finwt.

Different models were used to estimate the response probabilities of independent and maintained school pupils. The sample sizes for the two different groups are shown in Table 14 and the models are described in Sections 6.2.1 and 6.2.2.

Table 14: Wave 1 and 2 sample sizes

Category       Wave 1 responders    Wave 2 responders
Independent    530                  456
Maintained     15240                11383
Total          15770                13539

6.2.1 Modelling response from maintained school pupils

The Wave 1 data set included 15,240 pupils from maintained schools. The vast majority of these (14,674) had provided both Young Person (YP) and Main Parent (MP) interviews. This information was used in non-response weighting. The other 566 Wave 1 respondents were only partially productive in Wave 1, missing either a YP or MP interview. These provided far less useful
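The weighting step described above can be illustrated with a minimal Python sketch (Python is used here for illustration only; the guide's own syntax boxes are SPSS). It assumes that "combined" means the Wave 1 weight is multiplied by the non-response weight, which the text does not state explicitly. The function names, the linear predictor values and the pupil weights are all hypothetical.

```python
import math


def response_probability(linear_predictor):
    """Convert a fitted logistic-regression linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def nonresponse_weight(probability):
    """Non-response weight: reciprocal of the estimated response probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("response probability must lie in (0, 1]")
    return 1.0 / probability


def wave2_weight(wave1_weight, probability):
    """Assumption: the two weights are combined by multiplication."""
    return wave1_weight * nonresponse_weight(probability)


# Hypothetical pupils: (Wave 1 weight, estimated Wave 2 response probability)
pupils = [(1.20, 0.90), (0.80, 0.50), (1.00, 0.80)]
wave2_weights = [wave2_weight(w, p) for w, p in pupils]
```

Pupils who resemble non-responders (low estimated probability) receive larger weights, so that they stand in for the similar pupils who dropped out.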
27. LSYPE User Guide to the Datasets: Wave 1 to Wave 7
November 2011
Department for Education

CONTENTS
1 INTRODUCTION
1.1 Background of the Study
1.2 Objectives
2 SAMPLE DESIGN
2.1 Sample exclusions
2.2 Stage one: Sampling schools
2.3 Stage two: Sampling pupils
2.4 Sample boosts
2.5 Sampling at Wave 2
2.6 Sampling at Wave 3
2.7 Sampling at Wave 4
2.8 Sampling at Wave 5
2.9 Sampling at Wave 6
2.10 Sampling at Wave 7
3 RESPONSE RATES
3.1 Response rates at Wave 1
28. SPint 1
freq W1SPint w1ppressp0a

3
missing values w1chpreyp0a
compute W1YPint 1
if w1chpreyp0a 99 W1YPint 0
variable labels W1YPint W1 Young Person Interview Section conducted
value labels W1YPint 0 Young Person interview not conducted 1 Young Person interview conducted
execute
freq W1YPint w1chpreyp0a

4
missing values w1histphs0a
compute W1HISTint 1
if w1histphs0a 99 W1HISTint 0
if w1histphs0a 998 W1HISTint 998
if w1histphs0a 995 W1HISTint 995
if w1histphs0a 97 W1HISTint 97
if w1histphs0a 91 W1HISTint 91
variable labels W1HISTint W1 History Interview Section conducted
value labels W1HISTint 995 Missing history section unexplained 998 Interviewer missed section on questionnaire 97 Wrong respondent interviewed for history section 91 Not applicable 0 History interview not conducted 1 History interview conducted
execute
missing values W1HISTint 9 thru 1
freq W1HISTint w1histphs0a

5
missing values W3parentckMP NW3MPint
compute NW3SPint 0
if W3parentckMP 1 NW3SPint 1
if NW3MPint 0 and W3parentckMP 1 NW3SPint 91
if W3parentckMP 91 NW3SPint 98
variable labels NW3SPint W3 Whether interview was conducted jointly with second parent
value labels NW3SPint 98 SP not present 91 Not applicable no MP interview completed 0 Interview conducted solely by MP 1 Interview conducted jointly wit
29. TEduYP Unspecified routing problem led some Responses coded 997 w7NEETWrk2YP respondents to incorrectly miss these questions Script error w7AppEnableYP w7AppBensYP w NEETDifOYP w7 BullyDiscriminationYP w BullyConsiderYP w 7NumChiYP w NEETMainJ w7OwnChi2 w7QuaWageYP A small number of responses to this question Responses coded 998 were missing from the final dataset Missing data Question not asked w7AlcEverYP A small number of respondents were not asked Responses coded 996 this question due to a problem with a variable Problem with feed forward from a prior wave which should have been variable considered in routing respondents to this question 84 Variable section Problem identified Action taken W7SexSafeOftenYP These multicoded variables include a third None W7SexSafeOftenOYP response Not applicable respondent defined W7SexSafeOftenO2YP_ Therefore these variables have three categories 1 Yes 2 No 3 N a respondent defined rather than the two that would normally be expected with binary multicoded variables w7BenftsYP0e Despite correct routing being present in the As Child Benefit is not questionnaire this question was incorrectly means tested it is a fair asked only of respondents who declared their assumption that all the children at this wave This meant that young people who were respondents who mentioned their children at claiming it at Wave 6 would W
30. This approach will allow existing published sources to be matched as closely as possible. The Activity History files for Waves 4, 5, 6 and 7 are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.

5.15.5 LSYPE Household Grid

The LSYPE Household Grid files are not deposited and are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.

The Household Grid files contain two types of information: individual identifiers and identifying characteristics (e.g. the person number of each respondent, their sex and age), and cross-sectional variables collected about everyone in the household (e.g. relationships between household members). At Waves 5, 6 and 7 considerably less information was gathered in the Household Grid, and the variable containing information about the person's relation to the young person (e.g. W5Relation) was less detailed than in previous waves. Table 6 indicates the information available from the Household Grid.

Table 6: Summary of content of LSYPE Household Grid
Summary of conten
31. Wave 7: In-field tracing of movers. Movers were traced using the stable contact address details collected at the previous waves, where these were available. In total, 8,682 households took part in Wave 7.

3 RESPONSE RATES

The achieved response rates for LSYPE Waves 1 to 7 are discussed in detail in Sections 3.1 to 3.7 and summarised in Table 1.

Table 1: Achieved response rates for LSYPE Waves 1 to 7

                   Wave 1   Wave 2   Wave 3   Wave 4   Wave 4 Boost   Wave 5   Wave 6   Wave 7
Issued sample      21,000   15,678   13,525   12,468   600            11,793   11,225   9,791
Achieved sample    15,770   13,539   12,439   11,449   352            10,430   9,799    8,682

3.1 Response rates at Wave 1

Of the 21,000 young people sampled at Wave 1, the survey reached 15,770 households (74%). This comprised 13,914 full interviews (66%) and 1,856 partial interviews (9%).

3.2 Response rates at Wave 2

Of the 15,678 young people issued at Wave 2, the survey reached 13,539 households (86%), comprising 11,952 full interviews (76%) and 1,587 partial interviews (10%). As at Wave 1, the majority of partial interviews were cases where the second adult was not interviewed.

3.3 Response rates at Wave 3

Of the 13,525 young people issued at Wave 3, the survey reached 12,439 households (92%), comprising 12,148 full interviews (90%) and 291 partial interviews (2%). Partial interviews constituted 145 young people and 146 main parents not being interviewed.

3.4 Response rates at Wave 4

Of the 12 4
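As a quick arithmetic check, the response rates quoted for Waves 2 to 6 can be reproduced from the issued and achieved sample sizes in Table 1. The Python sketch below assumes the rate is simply achieved households divided by issued cases, rounded to the nearest whole percent; the dictionary and function names are illustrative and not part of the LSYPE documentation.

```python
# Issued and achieved sample sizes taken from Table 1 (Waves 2 to 6 and the Wave 4 boost)
issued = {
    "Wave 2": 15678,
    "Wave 3": 13525,
    "Wave 4": 12468,
    "Wave 4 Boost": 600,
    "Wave 5": 11793,
    "Wave 6": 11225,
}
achieved = {
    "Wave 2": 13539,
    "Wave 3": 12439,
    "Wave 4": 11449,
    "Wave 4 Boost": 352,
    "Wave 5": 10430,
    "Wave 6": 9799,
}


def response_rate(achieved_n, issued_n):
    """Response rate as a percentage of the issued sample."""
    return 100.0 * achieved_n / issued_n


# Whole-percent rates, as quoted in Sections 3.2 to 3.6
rates = {wave: round(response_rate(achieved[wave], issued[wave])) for wave in issued}

for wave, rate in rates.items():
    print(f"{wave}: {rate}%")
```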
32. able the multicoded

5.13 Variable labels

The variable labels included on the dataset were initially derived from the CAPI program. These have been reviewed in an effort to ensure consistency across waves and to clearly identify who the question was asked of and who the question related to. In order to enhance the variable labels, the labels now include prefixes within the label to indicate the following:
• who the question was asked of (i.e. MP, SP, YP);
• whether the variable was a survey administration variable; and
• whether the variable was a derived variable.

The variable labels now use the following list of prefixes to clearly identify the source of the question:
HH: Household Section Interview
MP: Main Parent Interview
SP: Second Parent Interview
SP MP: Second Parent information asked of either MP or SP
YP: Young Person Interview
HR: History Section
DV: Derived Variable (this clearly identifies that this is a derived variable)
ADMIN: Administrative data (this identifies when the question relates to the interviewer, for example coding whether the self completion section was completed)

Using MP, SP or YP as a prefix clearly identifies that the question was directly asked to that person, for example

variable to go from 0a through to zz, 0 representing the first 26 categories of answers and z representing the 27th grouping of 26 categories.

11 Used in Wave 4 to denote questions about the SP that could be a
33. ables are derived using the wave specific and interview specific (i.e. main, second, history) respondent identifier.

Box 6: Syntax for Index file derived variables 7 to 9

7
compute mainint 2
if cw1mainres gt 0 and cw2mainres gt 0 and cw3mainres gt 0 and cw1mainres cw2mainres and cw2mainres cw3mainres mainint 0
if cw1mainres gt 0 and cw2mainres gt 0 and cw3mainres gt 0 and cw1mainres cw2mainres cw2mainres cw3mainres mainint 1
if cw1mainres gt 0 and cw2mainres gt 0 and cw3mainres lt 0 and cw1mainres cw2mainres mainint 0
if cw1mainres gt 0 and cw2mainres gt 0 and cw3mainres lt 0 and cw1mainres cw2mainres mainint 1
if cw1mainres gt 0 and cw2mainres lt 0 and cw3mainres lt 0 mainint 0
if cw1mainres gt 0 and cw2mainres lt 0 and cw3mainres gt 0 and cw1mainres cw3mainres mainint 0
if cw1mainres gt 0 and cw2mainres lt 0 and cw3mainres gt 0 and cw1mainres cw3mainres mainint 1
if cw1mainres lt 0 and cw2mainres gt 0 and cw3mainres gt 0 and cw2mainres cw3mainres mainint 0
if cw1mainres lt 0 and cw2mainres gt 0 and cw3mainres gt 0 and cw2mainres cw3mainres mainint 1
if cw1mainres lt 0 and cw2mainres gt 0 and cw3mainres lt 0 mainint 0
if cw1mainres lt 0 and cw2mainres lt 0 and cw3mainres gt 0 mainint 0
if cw1mainres lt 0 and cw2mainres lt 0 and cw3mainres lt 0 mainint 98
if switch 1 mainint 91
if mainint 5 and w1mainres 994 and w2mainres 994 and w3mainr
34. ailable to approved researchers who make a request to DfE using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.

5.4 Wave 4 deposited data

Three data files have been deposited for Wave 4, based on the information collected from the young person, main parent and the second parent. The following three files have been deposited as cross-sectional data files:
• Wave Four LSYPE Family Background file (September 2009)
• Wave Four LSYPE Parental Attitudes file (June 2009)
• Wave Four LSYPE Young Person file (September 2009)

Two further files for Wave 4 are available but not deposited:
• Wave Four LSYPE Household Grid file (June 2009)
• Wave Four LSYPE Activity History file

The Household Grid is a hierarchical file containing one row for each individual identified in the household. This file contains a total of 55,856 cases, representing the 11,801 households who participated in the survey. The Activity History file is also hierarchical, containing one row for each activity completed by the respondent since the prior interview.

The LSYPE Household Grid and Activity History files are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive http://www.esds.ac.uk/findin
35. an 1 82 Caribbean 1 45 Mixed 2 26

6.4.2 Boost cohort Wave 4

For the boost cohort entering the study at Wave 4, the weight to account for the probability of being sampled to take part in the survey was the same as that assigned at the sampling stage at Wave 1. The design weight was applied and the non-response weights were calculated using CHAID to determine a probability of response, which was then inverted to give a non-response weight. The variables used in the CHAID were those that were available on the administrative data.

6.4.3 Combining main and boost

The initial design weights at Wave 1 were trimmed and scaled. This had to be accounted for when merging the main and boost files so that the main and boost were each correctly represented in the combined file. Different factors were applied to the main cohort and the boost cohort that adjusted for this, based on the design weights at the initial sampling stage. When combining the file, the design and non-response weights were applied to each of the main and boost files separately. Population weights were then applied so that the profile of the combined file matched the same population profile shown in Figure 1.

Figure 1: Combining main and boost weights (diagram labels: Eligible population, PLASC 2004; Issued sample, PLASC 2004; Non-cooperating schools; Longitudinal weight; Design & n-r; Design & n-r weighted boost; LSYPE main & boost with population weight)
36. and tertiary education or training to economic roles in early adulthood; to enhance the ability to monitor and evaluate the effects of existing policy and provide a strong information base for future policy development; to contextualise the implementation of new policies in terms of young people's current lives.

2 SAMPLE DESIGN

The original sample drawn for LSYPE comprised over 33,000 young people in Year 9 attending maintained schools, independent schools and pupil referral units in England in February 2004. The final issued sample for Wave 1 was approximately 21,000 young people. All sample members were born between 1 September 1989 and 31 August 1990.

For the maintained sector, LSYPE adopted a two-stage probability proportional to size (PPS) sampling procedure with disproportionate stratification. Schools were the primary sampling units (PSUs). Maintained schools were stratified into deprived/non-deprived, with deprived schools over-sampled by a factor of 1.5. The second stage sampled the pupils within schools. Pupils from major minority ethnic groups (Indian, Pakistani, Bangladeshi, Black African, Black Caribbean and Mixed) were over-sampled at pupil level in order to achieve target issued sample numbers of 1,000 in each group. The school sampling stage took into account the number of pupils from each of these minority groups. Taken together, the school selection probabilities and the pupil selection probabilities ensured
37. as interviewed. In addition, where factual discrepancies occur between information collected in consecutive waves of the study, information collected at the interview closest to when the activity took place takes precedence. If the Activity History and Monthly Main Activity files are used in parallel, differences such as these will need to be taken into consideration.

Source data relating to the Activity History section of the questionnaire at Waves 4, 5, 6 and 7 are available from DfE using the Confidentiality Agreement form, available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.

5.9 How to link the datasets

All of the datasets have a unique serial number and each file can be linked on the variable surveyid. This serial number is unique to the cohort member and therefore to each family. It is important that each file is sorted by surveyid in ascending order to link the datasets. A typical SPSS command to link files is shown in Box 1.

Box 1: Merging the datasets together

GET FILE C wave_one_lsype_young_person_file_16_ 05 08 sav
Sort cases by surveyid A
SAVE OUTFILE C wave_one_lsype_young_person_file_16 05 08 sav
GET FILE C wave_one_lsype_parental_attitudes_file_16 05 08 sav
Sort cases by surveyid A
SAVE OUTFILE C wave_one_lsype_parental_attitudes_file_16 05 08
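The linking logic described in Section 5.9 (sort each file by surveyid, then match cases on that key) can be sketched outside SPSS as well. The Python example below is a minimal illustration under stated assumptions: it keeps every surveyid that appears in either file, and where both files hold the same variable the first file's value is kept. The variable names yp_age and mp_employed and the records themselves are hypothetical, not LSYPE variables.

```python
def match_files(first, second, key="surveyid"):
    """Link two lists of records on a shared key, in ascending key order.

    Every key found in either file is kept; where both files hold the same
    variable, the value from the first file is used.
    """
    first_by_key = {row[key]: row for row in first}
    second_by_key = {row[key]: row for row in second}
    merged = []
    for value in sorted(set(first_by_key) | set(second_by_key)):
        combined = {key: value}
        combined.update(second_by_key.get(value, {}))
        combined.update(first_by_key.get(value, {}))
        merged.append(combined)
    return merged


# Hypothetical records standing in for two Wave 1 files
young_person = [{"surveyid": 3, "yp_age": 14}, {"surveyid": 1, "yp_age": 13}]
parental_attitudes = [
    {"surveyid": 1, "mp_employed": True},
    {"surveyid": 2, "mp_employed": False},
]

linked = match_files(young_person, parental_attitudes)
```

Because surveyid is unique to the cohort member, each output row corresponds to one young person and their family; variables from a file in which that surveyid is absent are simply missing from the row.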
38. asets or specific individual level datasets, it will be necessary to include some of the survey level variables described in Section 10.2 to facilitate correct linkage. Box 4 below provides some specific examples of how to use the Index file to link between the available deposited datasets.

Box 4: Merging with the Index file

Example 1: Matching one variable from the Index file onto a longitudinal dataset

When creating a longitudinal young person dataset that also accounts for survey response, the following syntax highlights how to use the Index file. Firstly, create the longitudinal dataset of interest.

GET FILE='C:\TEMP\Wave One LSYPE Young Person File.sav'.
Sort cases by surveyid (A).
SAVE OUTFILE='C:\TEMP\Wave One LSYPE Young Person File.sav'.
GET FILE='C:\TEMP\Wave Two LSYPE Young Person File.sav'.
Sort cases by surveyid (A).
SAVE OUTFILE='C:\TEMP\Wave Two LSYPE Young Person File.sav'.
GET FILE='C:\TEMP\Wave One LSYPE Young Person File.sav'.
MATCH FILES FILE=* /file='C:\TEMP\Wave Two LSYPE Young Person File.sav' /By surveyid.
EXECUTE.
Save OUTFILE='C:\TEMP\Wave One & Two LSYPE Young Person File.sav'.
GET FILE='C:\Index File.sav'.
Sort cases by surveyid (A).
Select if HHID = 1.
SAVE OUTFILE='C:\Longitudinal Index File.sav' /keep surveyid W1HHresp to W6HHresp.
GET file='C:\TEMP\Longitudinal Index File.sav'.
MATCH FILES FILE=* /file='C:\TEMP\Wave One & Two LSYPE Y
39. at the beginning of fieldwork. As a result of this there are 35 cases who are missing from NEETStat and NEET12 who should have been asked these questions.
Action taken: The cases have been coded to 997 (Script error).

Variable: W4TrainingYP
Problem identified: The filter in the Word questionnaire for the Training question is wrong. The filter before Training in the Word was as follows: Mainact 2 OR Mainact2 3 & jobcol 2 OR Jobcol 1 & examchk 2. The problem with this filter is that respondents who are in part-time employment (mainact2 3) never get asked jobcol and therefore can never qualify for the filter. The BMRB MORI script didn't follow the Word questionnaire and instead used the following filter: Mainact2 3 & examchk 2 OR Mainact 2 & jobcol 2 OR jobcol 1 & examchk 2. The NOP script did use the filter in the Word questionnaire and as a result 113 respondents were not asked the Training question.
Action taken: None.

Variables: W4HE1YP0a, W4HE1YP0b
Problem identified: At some point during the post-pilot drafting for YCS and LSYPE W4, the word "don't" failed to be deleted as intended from the LSYPE Word questionnaire. However, as the BMRB script ran off the YCS version, their CAPI programme matched the YCS Word questionnaire as intended. The NOP script matched the mistaken LSYPE Word questionnaire.
Action taken: None. Users should note this when looking at either or both of the variables.

W4HEDecnYP There are a small number of cases (max 3) who None W4Ben
40. ation application status at Wave 6
• Whether tried Cannabis by Wave 6 interview
• Whether had sex and at what age by Wave 6 interview

To obtain the final non-response weight, the inverse of the probability of response was taken and multiplied by the design weight. This was then trimmed at the 2nd and 98th percentiles to remove any extreme weights, and then scaled to the achieved sample size. Checks were done by comparing various statements for respondents who took part in Wave 7 (with the Wave 7 weight applied) against those who took part in Wave 6 (with the Wave 6 weight applied) and assessing them for similarities. None of the estimates looked at were considered to be outside an acceptable range of similarity.

No population weights were applied at Wave 7, as it was no longer possible to identify the up-to-date characteristics of the eligible population.

6.8 Weights to use in analysis

Every LSYPE wave with deposited data has an accompanying weight which is appropriate for analysis contained within a single wave. Caution must be applied where variables from multiple waves are used in a single piece of analysis. If this is the case, the general rule is always to use the weight from the most recent wave that a variable has been taken from. For example, if a cross-tabulation of attitude to school at Wave 1 by higher education application status at Wave 6 is completed, then a weight from Wave 6 is
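The Wave 7 weighting steps (invert the response probability, multiply by the design weight, trim at the 2nd and 98th percentiles, scale to the achieved sample size) can be sketched in Python as follows. This is an illustration under assumptions, not the procedure actually run for LSYPE: the percentile definition (Python's "inclusive" method), the reading of "scaled to the achieved sample size" as "weights sum to the number of respondents", and all input values are assumptions.

```python
import statistics


def trim_and_scale(weights, lower_pct=2, upper_pct=98):
    """Clip weights at the given percentiles, then rescale to sum to len(weights)."""
    cuts = statistics.quantiles(weights, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    trimmed = [min(max(w, low), high) for w in weights]
    factor = len(trimmed) / sum(trimmed)  # mean weight becomes 1
    return [w * factor for w in trimmed]


def final_weights(design_weights, response_probabilities):
    """Design weight times inverse response probability, then trimmed and scaled."""
    raw = [d / p for d, p in zip(design_weights, response_probabilities)]
    return trim_and_scale(raw)


# Hypothetical inputs: 100 respondents with equal design weights and
# estimated response probabilities rising from 0.5 to about 0.9
example = final_weights([1.0] * 100, [0.5 + 0.004 * i for i in range(100)])
```

Trimming caps the influence of the few respondents with very low estimated response probabilities, at the cost of a small amount of residual bias.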
41. ave 6 and had no further children to mention still be claiming it at Wave at Wave 7 were not routed to this question 7 To enforce that assumption the variable w7 BenftsYP0e was removed from the dataset and replaced by the derived variable w7BenftsOe w 7UnEmBenYP Some respondents missed the opportunity to Responses coded 997 w7 IncSupYP declare benefits in w7BenftsYP and instead Script error Free text w 7SkDsBnYP chose to mention them in the non dataset response not routed to w7FamilyYP question BenftsO The responses to this supplementary question w CCTCYP question were back coded into w7BenftsYP and should have been used in the routing to w7UnEmBenYP w7 IncSupYP w7SkDsBnYP w7 FamilyYP and w7CCTCYP Unfortunately this last routing did not correctly function and so a small number of cases missed these follow up questions 85 9 APPENDIX B UPDATES TO WAVE 1 AND WAVE 2 Tables 28 and 29 below highlight some variable name changes that have been made on the data since archiving These changes form part of an overall update to Waves 1 and 2 which took place in May 2008 Table 28 Updates to Wave 1 data VEEL i Problem identified Action taken section W1incareHH W1InCarHH W1intypeHH These variables have been renamed to ensure consistency across W1InTypHH WievercarMP0i waves W1whencarMP W1scomad2HS W1scomadi2HS Table 29 Updates to Wave 2 data Variable sect
42. ble the weighting process will increase the random error in the estimates, thus reducing their precision. The effect the weights have on precision can be measured by their efficiency or by the design effect (essentially the reciprocal of the efficiency). Table 13 shows the design effect and its breakdown.

Table 13: Design Effect

Stage of weighting                           Design effect
Selection weighting (after trimming)         1.250
Final weighting                              1.276

This shows the design has an efficiency of 1/1.276 = 78.4%. The interpretation is that a simple random sample 78% as large as the achieved sample would give equally accurate estimates of national quantities. This is mainly due to the selection weighting: selection weighting accounted for a design effect of 1.25, but the non-response weighting and grossing to match population proportions increased the design effect by only 1.276/1.250 = 1.02.

6.2 Wave 2 weights

This section explains how the data was weighted to account for the non-response between Waves 1 and 2. A design weight was provided by the fieldwork consortium; this variable is called designweight and is available in the Wave 1 dataset. This is the reciprocal of the pupil's selection probability, scaled so that the weighted and unweighted achieved sample sizes were equal.

There were a total of 15,770 productive or partially productive interviews in Wave 1, of which 15,678 were issued at Wave 2. Some of these did not respond in Wave 2
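The arithmetic behind Table 13 can be checked with a short Python sketch. The design effects are taken from the table; applying the efficiency to the 15,770 Wave 1 interviews to obtain an effective sample size is an illustrative assumption, and the variable names are hypothetical.

```python
def efficiency(design_effect):
    """Efficiency is the reciprocal of the design effect."""
    return 1.0 / design_effect


selection_deff = 1.250  # selection weighting (after trimming), Table 13
final_deff = 1.276      # final weighting, Table 13

# 1 / 1.276 is about 0.784, i.e. an efficiency of 78.4%
final_efficiency = efficiency(final_deff)

# Extra design effect from non-response weighting and grossing: about 1.02
extra_factor = final_deff / selection_deff

# Assumption for illustration: the effective sample size of the
# 15,770 Wave 1 interviews under the final weighting
effective_sample_size = 15770 * final_efficiency
```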
43. d so that as far as possible any mode effects were mitigated. In order to ensure that the online and face-to-face versions matched the telephone interview, nearly all prompted list-based questions used an 'active' format, where respondents had to give a response to each item. For online and face-to-face interviewing, list-based questions usually use a 'passive' format, where a list is shown to a respondent and they only identify the items that apply to them. This approach is generally adopted as it significantly lessens respondent burden, but comparability with the telephone interviews meant that the active approach had to be used. As a result of the need to use an active format, questions with prompted lists were kept to a minimum, and where they had to be used they included as few items as possible.

One version of the questionnaire was created covering online, telephone and face-to-face interviewing, with mode-specific interview instructions. This allowed virtually all variables to be constructed in the same format to allow for combined data analysis.

4.8 Questionnaire Modules

The questionnaire is split into five sections for the Wave 1, Wave 2 and Wave 4 interviews. These include separate sections covering the household, main parent, individual parents, young person and history. The individual parent section is asked of both the main and second parent (if a second parent is available). At Wave 3 the questionnaire did not include the histor
44. dent schools and PRUs were sampled using the school level annual schools census (SLASC). Independent schools were stratified by percentage of pupils achieving 5 or more A*-C GCSE grades in 2003, within boarding status (i.e. whether or not they had any boarding pupils), within gender of pupils (i.e. boys, girls and mixed). PRUs formed a stratum of their own. Independent schools and PRUs were sampled with probability proportional to the number of pupils aged 13 at that institution. 52 independent schools and 2 PRUs were sampled in this way.

2.3 Stage two: Sampling pupils

Within the maintained sector, pupils were sampled from PLASC. Pupil selection probabilities were dependent on ethnic group (as recorded in PLASC) and on school selection probabilities. The average number of pupils sampled per school was 33.25, although the number sampled per school varied according to the ethnic group composition of the school population. Parental and address details were not available from Pupil Level Annual School Census (PLASC) returns before 2006. Therefore, interviewers visited schools to collect the address details of the pupils sampled for the study.

Pupils in independent schools and PRUs were sampled directly from school rolls by interviewers, using a sampling program installed on their laptop computers. 33 or 34 Year 9 pupils (33.25 on average) were randomly selected at each independent school/PRU containing 34 or more Year 9 pupils. All the Year 9 pupils were s
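Sampling "with probability proportional to the number of pupils aged 13" can be illustrated with a short Python sketch of first-stage inclusion probabilities. It uses the standard fixed-size PPS approximation (sample size times the unit's share of the total size measure), which is an assumption about the method rather than a description of the exact LSYPE procedure; the school names and pupil counts are hypothetical.

```python
def pps_inclusion_probabilities(sizes, n_sample):
    """Inclusion probability of each unit under PPS sampling of n_sample units.

    Each probability is n_sample times the unit's share of the total size
    measure, capped at 1 (a simplification: a full design would re-allocate
    the excess from capped units to the remaining ones).
    """
    total = sum(sizes.values())
    return {unit: min(1.0, n_sample * size / total) for unit, size in sizes.items()}


# Hypothetical institutions with their number of pupils aged 13
schools = {"School A": 50, "School B": 100, "School C": 150, "School D": 200}
probabilities = pps_inclusion_probabilities(schools, n_sample=2)
```

If a roughly fixed number of pupils is then drawn within each selected school, larger schools are more likely to be chosen but each of their pupils is less likely to be picked, which tends to even out overall pupil selection probabilities.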
45. Education Maintenance Allowance (EMA); Jobs and training; Apprenticeships; NEET; Volunteering; Voting; Care to learn; Attitudes on local area; Income and benefits
Key: • Section included; o Section asked of boost respondents only

5.15.4 LSYPE Activity History file and Monthly Main Activity file

Data relating to the Activity History of young people are collected through a loop of questions asked at Waves 4, 5, 6 and 7. These questions look to record every activity that occupies the majority of the young person respondent's time at any given point between September 2006 (two months after completion of compulsory education) and the Wave 7 interview. Typical information collected through the Activity History includes the following:
• Type of activity (categorised as one of fourteen categories)
• Start/end date of activity
• Whether courses were completed
• Reasons for activity transition
• Whether illness/disability influenced change in activity
• Why periods of employment came to an end
• Activities being completed whilst unemployed
• Whether training accompanies periods of employment (at Wave 7 only)

The responses to the questions have been used to create the Activity History file at each wave, which represents the activities t
46. e Wave 3 4 5 6 7 Main eects ee section Special educational needs ss sid Relationship with young person and contact with services Reasons for not living with natural parents e e Risk factors absences truancy police contact bullying School History d e f o T T S Household responsibilities _ e e e f D J Household resources e e e e Demographics CT e e o o o o o Health and disability S e e e e Attitudes to schoolteachers e e e e e J Year 10 subject choices and reasons _ e e J Rules and discipline e J To T T S Homework wT ef J D f CU e i a E E e E Study Support e e J e _ J Future plans and advice e e e D J Information advice and guidance e e e Relations with parents e J T T T f a e drugs Household responsibilities ee e ee ey ee eee eee Childcare and caring responsibilities Ee Ee ee ee ee ee 8 of elsure Ue s fs Subjects being studied s Subjects being studied ____ o la aed td Apprenticeships and related schemes Current activities Current activities Sensors eae a a S ee Attitudestohighereducation l 1 e e e oe Attitudes to debt o o o o o S d e e e Attitudes to work o o d e T e e Higher education o Z o d OT d T e o Potential higher education students l l
47. e child. For example, at Wave 2 the mother may have answered the main parent questionnaire and the father the second parent, but at Wave 3 these roles could have reversed. Cases where this occurred can be identified by comparing the positions in the Household Grid at previous waves.

7 For users interested in identifying who answered the individual parent questions, the variable W3parentckMP is available on the Wave 3 Family Background File.

4.4 Fieldwork at Wave 4

Wave 4 fieldwork ran from 12 June 2007 to 14 October 2007. Wherever possible, interviewers were assigned to the same households they had interviewed at Wave 3. All interviewers were briefed via face-to-face briefings.

Prior to the fieldwork commencing, a website was set up for survey respondents in March 2007. This website contained information about the study for respondents, such as why it was set up and some of the findings. It also allowed respondents to update their contact details if they had moved and to give feedback about the study.

A 'keep in touch' exercise was carried out, consisting of a letter to all respondents who were going to be contacted for Wave 4, with the exception of the ethnic boost respondents. The letter thanked them for their help with the study so far, informed them about the new study website and let them know that they would be contacted later in the summer for the next interview. A colour leaflet called Next Steps News
48. e left untrimmed, but the top and bottom 0.5% of the maintained weights were trimmed and then re-scaled. The final non-response weights ranged from 0.93 to 1.43 (SD 0.079), with the percentiles shown in Table 17. These were combined with the Wave 2 weights to create the Wave 3 weights. After scaling to have a mean of 1 (SD 0.533), the final Wave 3 weight percentiles described in Table 17 were obtained.

Table 17: Percentiles
Percentile | Non-response weight | Final Wave 3 weight

The non-response weighting does not create a large loss of effective sample size. This is partly because the high response rate leads to low variability in the weights, but also because the non-response weights are negatively correlated with the Wave 2 weights.

6.4 Wave 4 weights

There were several stages to the Wave 4 weighting. Weighting was conducted for the main survey, the boost survey, and a combined main and boost survey. Both the main and boost cohorts incorporated weights accounting for the probability of being sampled to take part and a weight accounting for non-response. Finally, the main and combined files were each weighted to the population.

6.4.1 Main cohort Wave 4

The weights for this part of the sample incorporated weights which account for the probability of being in the sample at Wave 4, weights to account for non-response and, finally, a population weight which ensured the profile of the sample was consistent with the prof
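The trimming, re-scaling and combining steps described for the Wave 3 weights can be sketched in Python. This is an illustrative sketch only, not the syntax used to produce the deposited weights. It assumes that "trimming" means capping weights at the chosen percentiles, that percentiles are taken by the nearest-rank method, and that weights are combined by multiplication; all function names are hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the nearest-rank percentile of values (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)   # 1-based rank
    index = min(max(rank, 1), len(ordered)) - 1  # clamp to a valid position
    return ordered[index]


def trim_weights(weights, lower_pct=0.5, upper_pct=99.5):
    """Cap weights at the lower and upper percentiles (assumed meaning of 'trimmed')."""
    lower = percentile_nearest_rank(weights, lower_pct)
    upper = percentile_nearest_rank(weights, upper_pct)
    return [min(max(w, lower), upper) for w in weights]


def rescale_to_mean_one(weights):
    """Divide every weight by the mean so that the rescaled weights average 1."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


def combine_weights(previous_wave, non_response):
    """Multiply two weights case by case (an assumption), then rescale to mean 1."""
    combined = [p * r for p, r in zip(previous_wave, non_response)]
    return rescale_to_mean_one(combined)


# Toy data: cap at the 10th and 90th percentiles so the effect is visible.
trimmed = trim_weights([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], lower_pct=10, upper_pct=90)
final = combine_weights([1.0, 2.0], [1.0, 1.0])
```

Capping rather than dropping cases keeps every respondent in the file, which is why the sketch returns a list of the same length as its input.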
49. e.sav

    GET FILE='C:\TEMP\Wave One LSYPE Family Background File.sav'.
    MATCH FILES /FILE=*
      /FILE='C:\TEMP\Wave Two LSYPE Family Background File.sav'
      /BY surveyid.
    EXECUTE.
    SAVE OUTFILE='C:\TEMP\Wave One & Two LSYPE Family Background File.sav'.

The longitudinal version of the family background file will still hold a maximum of 15,770 cases, as this will refer to the young person structure. When matching these details back onto the Index file it is important to use the table function, which will then ensure that a main-parent-only dataset has been created.

    GET FILE='C:\TEMP\Longitudinal Index File mothers only.sav'.
    MATCH FILES /FILE=*
      /TABLE='C:\TEMP\Wave One & Two LSYPE Family Background File.sav'
      /BY surveyid.
    EXECUTE.

This file will now include the Wave 1 and Wave 2 family background details for main parents who remained the same respondent across both waves. When creating a main parent file regardless of whether the main parent remained the same respondent across waves, it is merely necessary to adapt the control syntax used. Likewise, this syntax could be adapted to create other analysis-specific datasets, such as a father-only dataset.

11 APPENDIX D: DERIVED VARIABLE GUIDE TO THE INDEX FILE

This chapter provides the syntax used to create various derived variables provided in the Index file.

A. Variables indicating interview completion

Variables 1 to 4 in Box 5 identify whether the
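The effect of the table match described above can be illustrated with a short Python sketch. It is not part of the guide's SPSS syntax: the function name, the `personno` field and the `w1_example` field are hypothetical, and only `surveyid` comes from the guide. The sketch keeps every row of the person-level file and spreads the household-level values across all rows sharing the same `surveyid`; lookup rows with no matching case are dropped.

```python
def table_match(cases, lookup_rows, key="surveyid"):
    """Attach lookup values to every case that shares the key.

    All rows in `cases` are kept; lookup rows with no matching case are dropped.
    Where both files hold a field of the same name, the value in `cases` wins.
    """
    lookup = {row[key]: row for row in lookup_rows}
    merged = []
    for case in cases:
        extra = lookup.get(case[key], {})
        merged.append({**extra, **case})
    return merged


# Hypothetical person-level rows (one row per household member).
index_rows = [
    {"surveyid": 1, "personno": 1},
    {"surveyid": 1, "personno": 2},
    {"surveyid": 2, "personno": 1},
]
# Hypothetical household-level rows (one row per young person).
family_background = [
    {"surveyid": 1, "w1_example": "a"},
    {"surveyid": 9, "w1_example": "z"},
]
merged = table_match(index_rows, family_background)
```

Both members of household 1 receive the same family background value, household 2 is kept without one, and household 9 never appears because it has no row in the person-level file.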
50. eeds (SEN) status.

School Level Data
This contains information about the school each sample member attended at the sampling stage and, where we have linked to NPD, information about the primary school attended by the young person at Key Stage 2.

Geographical Data
Data from the National Statistics Postcode Directory (NSPD) have been linked by postcode. A small number of non-disclosive variables are included in the family background files for each wave.

Due to the potentially disclosive nature of some of these variables, the main linked administrative data described above have not been included on the deposited LSYPE files. A reduced version has been deposited and includes information relating to number of school moves, free school meal eligibility, SEN, and Key Stage 2 and 3 results. Researchers requiring access to the fuller linked administrative files should contact DfE directly. Data are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.

It should be noted that later waves of LSYPE have included consent questions to enable linking between LSYPE data and data held by the Department for Work and Pensions and the Department for Business, Innovation and Skills. Whilst consent has been obtained for this additi
51.
5 SURVEY CONTENT
5.1 Wave 1 deposited data
5.2 Wave 2 deposited data
5.3 Wave 3 deposited data
5.4 Wave 4 deposited data
5.5 Wave 5 deposited data
5.6 Wave 6 deposited data
5.7 Wave 7 deposited data
5.8 Monthly Main Activity deposited data
5.9 How to link the datasets
5.10 Multicoded variables
5.11 Missing values
5.12 Variable names
5.13 Variable labels
5.14 ...
52. efits_1 to W4Benefits_18; W4Costs_1 to W4Costs_18; W4HeCon_1 to W4HeCon_24; W4Debtatt1 to W4Debtatt6
Problem identified: ... have given responses to these variables when routing checks indicate that they should not have done.
Action taken: Their information has been retained in the relevant variables.

Variable/section: Second parent questions
Problem identified: BMRB/MORI only. At the beginning of fieldwork the BMRB script was not identifying the presence of second parents in the household. This could not be picked up at the testing stage, as it was something that only affected the live script version, not the practice version that is used for testing. Most of this information was retrieved through re-interviewing.
Action taken: None. Any missing data will be coded appropriately.

Variable/section: W4KS4check1YP
Problem identified: At the beginning of fieldwork the NOP script was not pointing correctly at the sample column holding the information about the young person's attainment. Most of this information was retrieved through re-interviewing.
Action taken: None.

Variable/section: W4AlevuniYP
Problem identified: For NOP cases the filter for this question was incorrectly placed within a previous filter, which meant that although criteria were sometimes correct, a lot of respondents did not get the chance to meet those criteria.
Action taken: A flag variable, W4AlevflagYP, has been created to indicate whether the YP should have been asked the question.

Variable/section: StemAtt to
Problem identified: Some NOP cases miss these questions due to an
Action taken: Coded a
53. elected in the independent schools/PRUs that contained fewer than 34 but more than 5 Year 9 pupils. Of the 892 schools selected in total, 647 schools (73%) co-operated with the study. School-level non-response was a specific problem, especially in Inner London and in the independent sector, where only 56 and 57 per cent of schools responded respectively. Therefore the final issued sample was much smaller than the initial sample drawn from PLASC.

2.4 Sample boosts

As noted previously, the sample also included boosts. These boosts were included to ensure an adequate representation of the relevant sub-populations in England. The sample boosts included the 20% of schools with the most pupils in receipt of Free School Meals (and therefore pupils in these schools). Ethnic minority pupils were over-sampled at pupil level in the maintained sector using the PPS design. This method contrasts with methods used in many other studies, where numbers are boosted by over-sampling PSUs containing relatively high numbers of the groups of interest.

2.5 Sampling at Wave 2

The survey attempted to follow all the 15,770 households who took part in LSYPE Wave 1 where the child was still alive and living in the UK. Of the 15,770 households from Wave 1, a total of 15,678 households were issued at Wave 2. Altogether 92 cases were not issued at Wave 2, as 79 cases had refused or opted out prior to Wave 2 and 13 cases had moved abroad. In addition to school non
54. ents/guardians of the young people, which included a fact sheet that gave specific details regarding why the parents were being contacted and the purpose of the third wave. Both letters advised that the interviewer would be calling at their address following receipt of the letter. The young person's letter also included an unconditional £5 gift voucher incentive.

In total the interview consisted of four modules. The sample member completed one module, the young person interview, which lasted approximately 20 minutes. Adult interviews were also completed for household information, the main parent interview and individual parent interviews. Unlike at previous waves, in households with 2 parents there was no second parent interview. The main parent answered the individual parent questions on behalf of both parents. These lasted approximately 15 minutes altogether. The total target interview time was 35 minutes.

As at previous waves, the main parent was defined as the parent most involved in the young person's education. It was also possible that by Wave 3 some of the young people no longer lived with their parents. In total there were 15 cases with these young people answering some of the main parent questions within the young person module.

It should be noted that the main parent and second parent roles at previous waves were not necessarily carried forward to Wave 3, even in cases where the parent(s) were still living with th
55. erson was not currently living in the parental home. These rules (see Section 4.4) were again applied at Wave 6.

The Wave 6 interview consisted of two modules:
• Household Information Module: The young person answered questions about their household situation and gave details of any persons living with them.
• Young Person Module: This was completed by the sample member.

The full questionnaire, comprising both sections, was designed to take 25 minutes to complete.

In line with Wave 5, the mix of data collection methods used at Wave 6 required the design of the questionnaire to be carefully considered so that, as far as possible, any mode effects were mitigated. In order to ensure that the online and face-to-face versions matched the telephone interview, nearly all prompted list-based questions used an active format, where respondents had to give a response to each item. For online and face-to-face interviewing, list-based questions usually use a passive format, where a list is shown to a respondent and they only identify the items that apply to them. This approach is generally adopted as it significantly lessens respondent burden, but comparability with the telephone interviews meant that the active approach had to be used.

As a result of the need to use an active format, questions with prompted lists were kept to a minimum and, where they had to be used, they included as few items as possible. This approach was felt to be too unwield
56. es = -994 mainint = 0.
    value labels mainint
      -98 'No MP Interview'
      -91 'Not Applicable - person switched position within household across waves'
      0 'No change in MP person number'
      1 'MP has changed between waves'.
    exe.

8.
    compute secint = 2.
    if (cw1secores > 0 and cw2secores > 0 and cw3secores > 0 and cw1secores = cw2secores and cw2secores = cw3secores) secint = 0.
    if (cw1secores > 0 and cw2secores > 0 and cw3secores > 0 and (cw1secores <> cw2secores or cw2secores <> cw3secores)) secint = 1.
    if (cw1secores > 0 and cw2secores > 0 and cw3secores < 0 and cw1secores = cw2secores) secint = 0.
    if (cw1secores > 0 and cw2secores > 0 and cw3secores < 0 and cw1secores <> cw2secores) secint = 1.
    if (cw1secores > 0 and cw2secores < 0 and cw3secores < 0) secint = 0.
    if (cw1secores > 0 and cw2secores < 0 and cw3secores > 0 and cw1secores = cw3secores) secint = 0.
    if (cw1secores > 0 and cw2secores < 0 and cw3secores > 0 and cw1secores <> cw3secores) secint = 1.
    if (cw1secores < 0 and cw2secores > 0 and cw3secores > 0 and cw2secores = cw3secores) secint = 0.
    if (cw1secores < 0 and cw2secores > 0 and cw3secores > 0 and cw2secores <> cw3secores) secint = 1.
    if (cw1secores < 0 and cw2secores > 0 and cw3secores < 0) secint = 0.
    if (cw1secores < 0 and cw2secores < 0 and cw3secores > 0) secint = 0.
    if (cw1secores < 0 and cw2secores < 0 and cw3secores < 0) secint = -98.
    if (switch = 1) secint = -91.
    i
57. ess details collected at the previous waves, where these were available. Due to the age of the young people at Wave 5 it was no longer possible to use school data to provide new addresses for households who had moved and could not be traced.

In total, 10,430 households took part in Wave 5.

2.9 Sampling at Wave 6

All 11,793 households issued at Wave 5 were considered for surveying again at Wave 6. From this, 11,225 households were issued at Wave 6, with the following exclusions accounting for the 568 lost cases:
• 330 cases where the respondent refused to take part at Wave 5
• 4 cases where the respondent was physically or mentally unable to take part in Wave 5
• 196 cases where the respondent had moved and was untraceable at Wave 5
• 4 cases where the respondent had died prior to Wave 5
• 28 cases where the respondent took part in Wave 5 but refused to be re-contacted for future waves
• 4 cases where the respondent had died since taking part in Wave 5
• 2 cases where the respondent informed the research team they were unavailable to take part in Wave 6

In-field tracing of movers
Movers were traced using the stable contact address details collected at the previous waves, where these were available.

In total, 9,799 households took part in Wave 6.

2.10 Sampling at Wave 7

The Wave 7 sample consisted of all young people who had been interviewed at Wave 6 and who agreed to be re-contacted. In total, 9,791 cases were issued at
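The Wave 6 sample accounting above can be checked with a few lines of Python. This is only a worked arithmetic sketch using the counts quoted in the text; the dictionary keys are shorthand labels introduced here, and the final share is a derived figure rather than one reported in the guide.

```python
# Counts quoted in Section 2.9; the key names are shorthand, not variable names.
issued_w5 = 11_793
exclusions = {
    "refused at Wave 5": 330,
    "unable to take part in Wave 5": 4,
    "moved and untraceable at Wave 5": 196,
    "died prior to Wave 5": 4,
    "refused re-contact after Wave 5": 28,
    "died since Wave 5": 4,
    "unavailable for Wave 6": 2,
}

lost = sum(exclusions.values())      # total cases not issued at Wave 6
issued_w6 = issued_w5 - lost         # households issued at Wave 6
took_part_w6 = 9_799

# Share of issued households that took part (derived here, not quoted in the guide).
share_took_part = took_part_w6 / issued_w6

print(lost, issued_w6, round(share_took_part, 3))
```

The seven exclusion categories sum to the 568 lost cases, and subtracting them from the Wave 5 issued sample reproduces the 11,225 households issued at Wave 6.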
58. f secint = 5 and w1secores = -994 and w2secores = -994 and w3secores = -994 secint = 0.
    value labels secint
      -98 'No SP Interview'
      -91 'Not Applicable - person switched position within household across waves'
      0 'No change in SP person number'
      1 'SP has changed between waves'.
    exe.

9.
    compute HISint = 2.
    if (Cw1histres > 0 and Cw2histres > 0 and Cw1histres = Cw2histres) HISint = 0.
    if (Cw1histres > 0 and Cw2histres > 0 and Cw1histres <> Cw2histres) HISint = 1.
    if (Cw1histres > 0 and Cw2histres < 0) HISint = 0.
    if (Cw1histres < 0 and Cw2histres > 0) HISint = 0.
    if (Cw1histres < 0 and Cw2histres < 0) HISint = -98.
    if (switch = 1) HISint = -91.
    if (hisint = 5 and w1histres = -994) hisint = -994.
    value labels HISint
      -994 'New member of household at Wave four (inc. boost cases)'
      -98 'No HISTORY Interview'
      -91 'Not Applicable - person switched position within household across waves'
      0 'No change in HISTORY person number'
      1 'HISTORY person number has changed between waves'.
    exe.

C. Variable indicating Household Grid completion

Variable 10 in Box 7 has been derived to identify a longitudinal-level response across Waves 1 to 4. This variable is based on whether the Household Grid has been completed at each specific wave; therefore, if a Household Grid is only available at Wave 1, the variable resps will indicate that the longitudinal-level response is Wave 1 only.

Box 7: Syntax for Index file derived variable 10

10.
    compute res
59. gData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk.

Information on the activities completed by the young person respondent since the prior interview is alternatively available from the LSYPE Monthly Main Activity file for Waves 4 to 7. See Section 5.8 for more information on this file.

5.5 Wave 5 deposited data

Two data files have been deposited for Wave 5, based on the information collected from the young person. The following files have been deposited as cross-sectional data files:
• Wave Five LSYPE Family Background file, March 2010
• Wave Five LSYPE Young Person file, March 2010

Two further files for Wave 5 are available but not deposited:
• Wave Five LSYPE Household Grid file, March 2010
• Wave Five LSYPE Activity History file

The Household Grid is structured as a hierarchical file containing one row for each individual identified in the household. The file contains a total of 51,121 cases, representing the 10,430 households who participated in the survey. The Activity History file is also hierarchical, containing one row for each activity completed by the respondent since the prior interview.

The LSYPE Household Grid and Activity History files are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team
60. h MP and SP
    execute.
    freq NW3SPint W3parentckMP.

6.
    recode w4sourcesp (sysmis = 5) (else = copy).
    missing values w4sourcesp ().
    compute W4SPint = 1.
    if (w4sourcesp = -99) W4SPint = 0.
    if (w4sourcesp = -98) W4SPint = -91.
    if (w4sourcesp = -997) w4spint = -997.
    if (w4sourcesp = -91) w4spint = -91.
    if (w4sourcesp > 0) w4spint = w4sourcesp.
    if (w4hhid = -973) w4SPint = -973.
    if (w4hhid = -970) w4Spint = -970.
    if (w4hhid = -971) w4Spint = -971.
    if (w4hhid = -972) w4Spint = -972.
    if (w4hhid = -974) w4Spint = -974.
    variable labels W4SPint 'W4 Second Parent Interview Section completed'.
    value labels W4SPint
      -99
      -997 'Script Error'
      -996 'No Parent in household'
      -970 'Response in W1 only'
      -971 'Response in W2 only'
      -972 'Response in W3 only'
      -973 'Response in W1 and W2 only'
      -974 'Response in W1 and W2 and W3 only'
      -91 'Not applicable - second parent not present in HH'
      0 'Second Parent interview not completed'
      1 'Second Parent interview completed by second parent'
      2 'Second Parent interview completed by main parent or other adult'
      3 'Second Parent interview completed by main parent with consultation from second parent'.
    execute.

B. Variables indicating change in main/second/history parent role

Variables 7 to 9 in Box 6 identify whether the main/second/history parent person number changed at any point across the first three waves. These variables identify whether the respondent was the same person at each wave or whether this respondent changed. These vari
61. hat took place between interviews. All resulting Activity History files are therefore one line per activity, with respondents who engaged in multiple activities having multiple lines. Those who did not change activity between interviews will not be included in the Activity History file; their main activity information is picked up from the Young Person file using the variables relating to the activity at the time of interview.

Activity History files for Waves 4, 5, 6 and 7 have not been deposited; however, an alternative Monthly Main Activity file is available. This file synthesises the Activity History information from all four waves and derives the main activity for each month from September 2006 to May 2010. This activity is summarised to be either Education, Employment, Apprenticeship/Training, or Unemployed/Inactive (NEET). Where information was missing in the Activity History file, the derivation for the main activity either suppresses information for the months affected or, where a start/end date is missing, randomly imputes a start/end date in the appropriate date window.

Where it is required to perform analysis based on the activity of the young person respondent, it is recommended that data from the Monthly Main Activity file is used. This is best practice, as it removes inaccuracies that are introduced by using the current activity at the time of interview, as interviews were conducted over a six-month period.
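The step from one-row-per-activity data to a monthly series can be sketched in Python. This is a simplified illustration, not the derivation DfE used: the spell layout (start month, end month, activity), the rule that the first matching spell wins, and the use of `None` for suppressed months are all assumptions made for the example, and random imputation of missing dates is not shown.

```python
def month_range(start, end):
    """List (year, month) pairs from start to end, inclusive."""
    year, month = start
    months = []
    while (year, month) <= end:
        months.append((year, month))
        month += 1
        if month == 13:
            year, month = year + 1, 1
    return months


def monthly_series(spells, start=(2006, 9), end=(2010, 5)):
    """Map each month to the activity of the first spell covering it.

    Months not covered by any spell are left as None (information suppressed).
    """
    series = {}
    for month in month_range(start, end):
        series[month] = None
        for spell_start, spell_end, activity in spells:
            if spell_start <= month <= spell_end:
                series[month] = activity
                break
    return series


# Hypothetical respondent with a two-month gap in recorded activity.
spells = [
    ((2006, 9), (2008, 6), "Education"),
    ((2008, 9), (2010, 5), "Employment"),
]
series = monthly_series(spells)
```

The window from September 2006 to May 2010 spans 45 months, so each respondent ends up with 45 monthly values in a single row, which is what makes the monthly file easier to analyse than the hierarchical Activity History files.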
62. he UK. Of the 12,439 households from Wave 3, 12,410 households were issued at Wave 4. Altogether 29 cases were not issued at Wave 4, with these cases having either refused or opted out following Wave 3. In addition to this, 4 households who had previously refused to take part at either Wave 2 or Wave 3 asked to be re-included and were issued at Wave 4.

In-field tracing of movers
Movers were initially traced using the stable contact address details collected at the previous waves, where these were available. At the end of Wave 3 fieldwork there were 166 households who had moved and could not be traced. For 54 of these households, schools data were used to source new addresses, allowing them to be issued at Wave 4.

Ethnic boost sample at Wave 4
The Wave 4 sample frame also included an ethnic minority boost of six hundred Black African and Black Caribbean young people. This sample was selected from schools who did not co-operate in the initial Wave 1 sampling frame.

In total, 11,801 households took part in Wave 4.

2.8 Sampling at Wave 5

From a total of 11,801 households from Wave 4, 11,793 households were issued at Wave 5. Forty-two cases had either refused or opted out in between Waves 4 and 5, counterbalanced by 34 households who had previously refused to take part at either Wave 2, 3 or 4 but asked to be re-included and were issued at Wave 5.

In-field tracing of movers
Movers were initially traced using the stable contact addr
63. he commands in both SPSS and STATA needed to specify the sample design.

Box 3: Syntax to specify the sample design
This example is based on Wave 1 variables.

SPSS COMMAND

    CSPLAN ANALYSIS
      /PLAN FILE='enter file path and file name here.csaplan'
      /PLANVARS ANALYSISWEIGHT=W1FinWt
      /PRINT PLAN
      /DESIGN STRATA=SampStratum CLUSTER=SampPSU
      /ESTIMATOR TYPE=WR.

STATA COMMAND

    svyset [pweight=W1FinWt], psu(SampPSU) strata(SampStratum)

7 DATA LINKAGE

The LSYPE data has been linked to administrative data held on the National Pupil Database (NPD). The NPD is a pupil-level database which matches pupil and school characteristic data to pupil-level attainment. We have also linked to school-level indicators (such as school size, proportion of pupils gaining 5 or more GCSEs at grades A*-C, and truancy rates) and to geographical indicators (such as the Index of Multiple Deprivation (IMD), the Income Deprivation Affecting Children Index (IDACI), and urban/rural indicators).

More detail on the three types of administrative data files currently linked to LSYPE can be found below. They can all be linked to the deposited survey data using surveyid.

National Pupil Database
The majority of pupils sampled from maintained schools have been linked to their NPD records. This file includes data on pupils' attainment at Key Stage 2, Key Stage 3 and Key Stage 4, and data about the pupil such as free school meal eligibility and Special Education N
64. he main parent interview was conducted by the mother at Wave 1 and by the father at Wave 2. If the respondent changed at any wave they will take the value of 1, and this value will be assigned at the household level. Individuals who have been identified as switching their positions across waves (Switch = 1) will be set to -91 on these variables, but the remaining household members will take a value of 1 or 0. Therefore, if a household includes a respondent who switched positions, it is possible that Mainint will be set to 1 for the remaining household members within that household, but for the individual who had switched positions Mainint will be set to -91.

Footnote 7: The Wave 4 variable differs slightly from this. At Wave 4 a boost sample was introduced; if a household member is a boost case they are assigned a value of 1, and if a new member of a previously sampled household has entered the Household Grid at Wave 4 they are assigned a value of 2.
Footnote 8: See Appendix D, Section B for details on how this variable was derived.

Where there is interest in identifying households where the respondent who completed the main, second or history interview had changed across waves, cases where mainint = 1 should be selected. A combination of the variables W1mainres, W2mainres, W3mainres and W4mainres can then be used to identify the person number of the respondent who completed each interview.

10.3 Merging datasets with the Index file

To create longitudinal dat
65. he telephone fieldwork were sent another reminder letter (and email where possible) reminding them that they had got half way through the web survey and asking them to finish it. A reminder email was also sent out to respondents who had not yet completed a web interview.

The telephone element of fieldwork started two weeks after the web element (2 June 2010). All respondents who had not completed a web interview, or were not pre-allocated to the face-to-face stage, but did have a telephone number entered the telephone stage.

Any respondents who entered the face-to-face stage were sent a second advance letter letting them know an interviewer would be calling round soon to conduct an interview.

In line with the previous three waves, specific instructions were provided to interviewers to deal with cases where the young person was not currently living in the parental home. These rules (see Section 4.4) were again applied at Wave 7.

The Wave 7 interview consisted of two modules:
• Household Information Module: The young person answered questions about their household situation and gave details of any persons living with them.
• Young Person Module: This was completed by the sample member.

The questionnaire, comprising both sections, was designed to take 25 minutes to complete.

In line with Wave 5 and Wave 6, the mix of data collection methods used at Wave 7 required the design of the questionnaire to be carefully considere
66. here it is possible to enter a non-valid response, such as entering a space, and then to be routed as if a valid response had been given.
Action taken: ... distinguish those who answered W5lfUni having given a valid response from those who were incorrectly routed.

Variable/section: W5EdExSubYP0a to W5EdExSubYP0d; W5CITYSubYP0a to W5CITYSubYP0d
Problem identified: As above, there is some inconsistency where the young person has entered an invalid response but the routing has taken them through to the follow-up question.
Action taken: None.

Variable/section: W5mselfYP, W5mmanysYP, W5fselfYP, W5fmanysYP
Problem identified: For self-employed people, W5mselfYP/W5mmanysYP and W5fselfYP/W5fmanysYP don't always correlate, but the data collection allowed this.
Action taken: None.

Variable/section: Parental employment questions
Problem identified: Users should note that all information relating to parental employment is provided by the young person at Wave 5.
Action taken: None.

Variable/section: SIC and SOC coding
Problem identified: With the introduction of a self-completion element to the survey, SIC and SOC coding became slightly more difficult. Interviewers are experienced in collecting the correct information that is needed for this type of coding and will probe where necessary; therefore responses to the web survey were more difficult to code than the others, and there may end up being a higher proportion of uncodable responses for this mode at Wave 5.
Action taken: None.

Variable/section: Variables with low file numbers that have been combined with othe
67. his file.

5.8 Monthly Main Activity deposited data

In autumn 2011, DfE deposited an additional data file relating to the activities of 11,821 young person respondents recorded at Waves 4, 5, 6 and 7. This information is considered to be the primary source of information for establishing a young person's main activity at any point in time.

Whilst it is also possible to obtain information on activities current at the time of interview from the Young Person file at each wave, this approach is not recommended for any analysis involving the comparison of activities, due to the interview period being spread across six months. At every wave, these six months always crossed the start and end of two consecutive academic years and the traditional summer vacation period, which is known to often be a time of activity transition. Analysis of activities should therefore, where possible, focus on activities conducted in a single month, as indicated by the Monthly Main Activity file.

The Monthly Main Activity file takes responses to the Activity History section of the questionnaire at each wave and synthesises this information into variables that represent a monthly time series running from September 2006 (two months after the respondents completed compulsory education) until May 2010 (the first month of interviews for Wave 7). The dataset has one row per respondent, making it easier to use than the Activity History files, which are one row pe
68. iaaa aa tire AANA ETE nt
6.3.1 Achieved sample
6.3.2 Weighting maintained school pupils
6.3.3 Weighting independent school pupils
6.3.4 Combining maintained and non-maintained weights
6.4 Wave 4 weights
6.4.1 Main cohort Wave 4
6.4.2 Boost cohort Wave 4
6.4.3 Combining main and boost
6.5 Wave 5 weights
6.6 Wave 6 weights
6.7 Wave 7 weights
6.8 Weights to use in analysis
6.9 Specifying the sample design
7 DATA LINKAGE
8 APPENDIX A: QUESTIONNAIRE AND DATA PROBLEMS
9 APPENDIX B: UPDATES TO WAVE 1 AND WAVE 2
10 APPENDIX C: INDEX FILE
10.1 How to use the Index file
69. id and other datasets. The young person is fixed as person one in all households and can be identified by selecting on the variable W1HHID = 1 (or W2HHID = 1, W3HHID = 1, etc., depending on LSYPE wave).

Footnote 15: At Waves 5, 6 and 7 it is possible that a small number of households will have members duplicated in the grid. This happened where the young person had moved out of a family home at W4 and then moved back into the previous household at Wave 5, or where a person moved out prior to Wave 5 and moved back in. Data collection methods did not allow any cross-referencing with people who were not listed as being present in the household at Wave 4; therefore in these cases the household members were recorded as being new to the household at Wave 5. As there is less identifying information (such as date of birth, ethnicity) for household members at Wave 5, no assumptions have been made regarding previous household positions. The variable W5shgint ensures that a household member will

In Waves 1, 2, 3 and 4, both the main and second parent can take any position within the Household Grid. The main parent can be identified using the variable W1MAINRES (or W2MAINRES, W3MAINRES, W4MAINRES, depending on LSYPE wave). The second parent can be identified using the variable W1SECORES (or W2SECORES, W3SECORES, W4SECORES, depending on LSYPE wave). From Wave 5 onwards, main and second parents are no longer identified, so these variables are not present. A value of -98 used on
70. ile that would be expected in the survey population.

The weights used to represent the probability of being in the sample at Wave 4 are the final weights from Wave 3. There were also a small number of respondents who did not take part at Wave 3 but who did take part at Wave 1 or Wave 2 (see footnote 19). These respondents were given their weight from the most recent wave they completed, and this was multiplied by the mean final weight from Wave 3.

The weight accounting for the probability of being in the sample at Wave 4 was initially applied. Non-response weights were then created using logistic regression to determine a probability of taking part, using variables consistent with those investigated in the preceding three waves. The two weights were multiplied together to give an overall weight for the main cohort. This weight was applied and then the data was rim weighted to the profile shown in Table 18.

Footnote 19: These were respondents who had either moved and been relocated for the Wave 4 fieldwork or requested to rejoin the survey.

Table 18: Target proportions for weighting at Wave 4

Variable       Category                     Proportion (%)
(not legible)  (not legible)                50.59
               (not legible)                49.41
School type    Maintained                   93.57
               Independent                   6.43
GOR            North East                    5.34
               North West                   14.86
               Yorkshire and The Humber     10.64
               East Midlands                 8.91
               West Midlands                11.43
               East of England              11.04
               London                       12.48
               South East                   15.48
               South West                    9.82
Ethnicity      White/other/Not known        88.86
               Bangladeshi                   0.98
               Pakistani                     2.35
               Indian                        2.29
               Afric
71. ion | Problem identified | Action taken
Problem identified (all rows below): These variables have been renamed to ensure consistency across all waves.
Variable | Action taken
W2OwgherMP | W2OwherMP
W2Ben4bMP | W2Ben4AQbMP
WeMibelp Me | W2Mhelp1MP0a
W2scomad2HS | W2scomadi2HS
W2OutschYP | W2OutschnYP
W2YouBulYP0a | W2youbulnYP0a
W2YouBulYP0b | W2youbulnYP0b
W2YouBulYP0c | W2youbulnYP0c
W2YouBulYP0d | W2youbulnYP0d
W2YouBulYP0e | W2youbulnYP0e
W2SCwhopreYP0a | W2dwhopreYP0a
W2SCwhopreYP0b | W2dwhopreYP0b
W2SCwhopreYP0c | W2dwhopreYP0c
W2SCwhopreYP0d | W2dwhopreYP0d
W2SCwhopreYP0e | W2dwhopreYP0e
W2SCwhopreYP0f | W2dwhopreYP0f
W2SCwhopreYP0g | W2dwhopreYP0g
W2SCwhopreYP0h | W2dwhopreYP0h
W2carehrs2YP | W2carehr2YP
10 APPENDIX C: INDEX FILE
The LSYPE Index file is not deposited in the archive and is only available to approved researchers who make a request to DfE using the Confidentiality Agreement form available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk. The Index file is a longitudinal file containing information on all household members, including the young person, collected at Waves 1 to 7. The sample consists of the sample member (known as the Young Person) who was present and interviewed at Wave 1, and any members of their household who were present in the household at Wave 1 or any s
72. iously. Additionally, in line with Wave 5, only the sample young person was interviewed; there were no parental interviews. All web respondents received an advance letter and incentive via the post, and those with an email address (74%) were sent an advance email. Contact via email allowed easier access to the survey because the young people were able to click directly on the hyperlink included in the email. Where a web survey was left incomplete and a telephone number was held for the respondent, a reminder call was made during the later stages of fieldwork. Any partial web cases that had still not completed a survey after the end of the telephone fieldwork were sent another reminder letter (and email where possible), reminding them that they had got half way through the web survey and asking them to finish it. A reminder email was also sent out to respondents who had not yet completed a web interview.
The telephone element of fieldwork started two weeks after the web element. All respondents who had not completed a web interview, who were not pre-allocated to the face-to-face stage and who had a telephone number entered the telephone stage. Any respondents who entered the face-to-face stage were sent a second advance letter letting them know an interviewer would be calling round to conduct an interview. In line with Wave 4 and Wave 5, specific instructions were provided to interviewers to deal with cases where the young p
73. is variable is at individual level and records who within the household was the mother (natural, adoptive, step or foster) of the young person at Wave x.
10.2.17 WxFather
This variable is at individual level and records who within the household was the father (natural, adoptive, step or foster) of the young person at Wave x.
10.2.18 WxMPMother, WxMPFather
These variables are at individual level and identify which parent answered the Main Parent section at each wave. There can only be one main parent interviewee at each wave; therefore, if this is the mother then the variable WxMPFather will be set to 91, and vice versa. If the household member is not the main or second parent then they will be set to 95 within these variables. Table 30 provides an example of how this information may look for a typical family.
Table 30: Example data showing how to identify the main parent
HHID | W1HHID | ReltoYP | Sex | W1MOTHER | W1FATHER | W1MPMOTHER | W1SPFATHER
1 | 1 | 91 | 1 | 91 | 91 | 95 | 95
2 | 2 | 8 | 2 | 1 | 91 | 1 | 91
3 | 3 | 8 | 1 | 91 | 1 | 91 | 1
Only the person who is the mother or the father is identified as such; all other household members are set to a missing value. Person 1 is the young person and as such is set to 95. Person 2 is the main parent (mother); therefore this variable is set to 91 for Person 3, the second parent (father).
10.2.19 WxSPMother, WxSPFather
These variables are at individual level and identify which pa
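As an illustration of how the flags in Table 30 can be read, the sketch below loads the three example rows and picks out the young person, the main parent and the second parent. It assumes that a value of 1 marks the person who holds the role, and it uses the codes exactly as they appear in the extracted table. The helper name `members_flagged` is hypothetical and not part of the LSYPE documentation.

```python
COLUMNS = ["HHID", "W1HHID", "ReltoYP", "Sex",
           "W1MOTHER", "W1FATHER", "W1MPMOTHER", "W1SPFATHER"]

# The three example rows from Table 30.
TABLE_30 = [
    [1, 1, 91, 1, 91, 91, 95, 95],
    [2, 2, 8, 2, 1, 91, 1, 91],
    [3, 3, 8, 1, 91, 1, 91, 1],
]
rows = [dict(zip(COLUMNS, values)) for values in TABLE_30]


def members_flagged(rows, flags):
    """Return the HHIDs of members with a value of 1 on any named flag.
    Flags absent from the data (e.g. W1MPFATHER in this example) are ignored."""
    return [r["HHID"] for r in rows if any(r.get(f) == 1 for f in flags)]


young_person = [r["HHID"] for r in rows if r["W1HHID"] == 1]
main_parent = members_flagged(rows, ["W1MPMOTHER", "W1MPFATHER"])
second_parent = members_flagged(rows, ["W1SPMOTHER", "W1SPFATHER"])
```

Checking both the mother and father flag for each role keeps the lookup valid whichever parent was interviewed as the main parent.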
74. logistic regression model was used to estimate the probabilities of each individual to respond to the survey. Data were initially weighted by the Wave 2 weight, scaled to equal the achieved sample size. Various variables were used as potential explanatory variables. Some were variables obtained from the sample frame (ethnicity, GOR etc.). Others were socio-economic variables obtained from the answers to the Wave 2 questionnaire, such as single parent status, current working status, income support status etc. Additionally, some were other answers to the Wave 2 questionnaire, including use of cannabis and the pupil's plans to leave school. A forward stepwise regression procedure was used to determine which variables should be included in the final model, with the final statistically significant terms being:
• Whether YP ever tried cannabis
• Use of a home computer
• Ethnicity
• Payment for extra tuition (non-school subjects)
• Free School Meals
• Whether the YP does homework
• Job Seekers Allowance
• Qualifications
• Whether still at same school
• Frequency of the MP's communication with the YP's teachers
• Whether YP currently thought to have special educational needs
• Suspension
• Tenure
A logistic regression model containing the variables given above, plus the additional variables admission status and deprivation status, was used to estimate response probabilities.
Partially productive respondents (Maintained schools)
A logi
75. longitudinal@education.gsi.gov.uk. Information on the activities completed by the young person respondent since the prior interview is alternatively available from the LSYPE Monthly Main Activity file for Waves 4 to 7. See Section 5.8 for more information on this file.
5.6 Wave 6 deposited data
One data file has been deposited for Wave 6, based on the information collected from the young person. This has been deposited as a cross-sectional data file:
• Wave Six LSYPE Young Person file (October 2010)
Two further files for Wave 6 are available but not deposited:
• Wave Six LSYPE Household Grid file (October 2010)
• Wave Six LSYPE Activity History file
The Household Grid is a hierarchical file containing one row for each individual identified in the household. The file contains a total of 49,838 cases representing the 9,799 households who participated in the survey. The Activity History file is also hierarchical, containing one row for each activity completed by the respondent since the prior interview. The LSYPE Household Grid and Activity History files are only available to approved researchers who make a request to DfE using the Confidentiality Agreement form available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk. Information on the activities completed by the young person respondent since the prior interview
76. mail was also sent out to respondents who had not yet completed a web interview.
Footnote 3: With the exception of the small number of young people who were living independently at Wave 4.
The telephone element of fieldwork started two weeks after the web element. All respondents who had not completed a web interview, who were not pre-allocated to the face-to-face stage and who had a telephone number entered the telephone stage. Any respondents who entered the face-to-face stage were sent a second advance letter letting them know an interviewer would be calling round to conduct an interview. As at Wave 4, specific instructions were provided to interviewers to deal with cases where the young person was not currently living in the parental home. These rules (see Section 4.4) were again applied at Wave 5.
The Wave 5 interview consisted of two modules:
• Household Information Module: The young person answered questions about their household situation and gave details of any persons living with them.
• Young Person Module: This was completed by the sample member.
The full questionnaire, comprising both modules, was designed to take 25 minutes to complete. The mix of data collection methods used at Wave 5 required the design of the questionnaire to be carefully considered so that, as far as possible, any mode effects were mitigated. In order to ensure that the online and face-to-face versions matched the telephone interview, nearly all pr
77. nd Wave 4
• How likely they were to apply to university to do a degree (in Wave 4)
Their propensity to respond was inverted to give a non-response weight, which was multiplied by their final weight from the previous wave in which they took part. The final weighted data for Wave 6 were compared to the weighted profiles for the previous waves to ensure that the key demographic variables were still in line with previous surveys.
6.7 Wave 7 weights
As with the previous wave, Wave 7 weighting involved two stages. Firstly, the design weights were applied, which accounted for the probability of being in the sample. For this wave the final weights from Wave 6 were used as the design weights. With these weights applied, the profile of the issued cases was compared with that of the achieved cases with regard to a range of variables from the previous wave. A logistic regression was carried out to estimate the probability of response among key groups that were associated with non-response, or that were considered important to control for between the waves. These looked at the respondent's situation in the previous wave and how the sample compared on a range of measures. The final criteria that were controlled for were:
• Sex
• Ethnic group (with White, Other and Not known combined)
• Tenure at Wave 6
• Survey mode at Wave 6 (internet, telephone or face to face)
• Interview month at Wave 6
• Higher Educ
78. nges in the way that the study was conducted compared to previous waves. The first change to the design of the study was the move to a mixed mode data collection approach. In previous waves all interviews had been conducted in the young person's home using Computer Assisted Personal Interviewing (CAPI). From Wave 5 the young person could complete the interview either online, over the telephone, or face to face with an interviewer in their own home as they had done previously. The other significant change to the study design is that Wave 5 was the first wave which only involved interviewing the sampled young person. At previous waves at least one of the sampled young person's parents/guardians was also interviewed.
All web respondents received an advance letter and incentive via the post, and those with an email address (58%) were sent an advance email. Contact via email allowed easier access to the survey because the young people were able to click directly on the hyperlink included in the email. Where a web survey was left incomplete and a telephone number was held for the respondent, a reminder call was made during the later stages of fieldwork. Any partial web cases that had still not completed a survey after the end of the telephone fieldwork were sent another reminder letter (and email where possible), reminding them that they had got half way through the web survey and asking them to finish it. A reminder e
79. nse in W2 only'
 4 'Response in W3 only'
 5 'Response in W4 only'
 6 'Response in W1 & W2 only'
 7 'Response in W2 & W3 only'
 8 'Response in W1 & W3 only'
 9 'Response in W1 & W4 only'
 10 'Response in W2 & W4 only'
 11 'Response in W3 & W4 only'
 12 'Response in W1, W2 & W3 only'
 13 'Response in W1, W2 & W4 only'
 14 'Response in W1, W3 & W4 only'
 15 'Response in W2, W3 & W4 only'.
exe.
© Crown copyright 2011
You may re-use this information (not including logos) free of charge in any format or medium, under the terms of the Open Government Licence. To view this licence, visit http://www.nationalarchives.gov.uk/doc/open-government-licence/ or write to the Information Policy Team, The National Archives, Kew, London TW9 4DU, or e-mail: psi@nationalarchives.gsi.gov.uk.
Any enquiries regarding this document/publication should be sent to us at team.longitudinal@education.gsi.gov.uk.
Department for Education
80. ntained sector and 646 of them took part. Weighting the maintained sample consisted of three steps. First, weights were calculated for school non-response; then pupil non-response was modelled within responding schools; finally, calibration weights were calculated.
School non-response
Schools were measured on various explanatory variables. These included the proportion of pupils from non-White ethnic groups, the proportion with 5 or more GCSEs at grades A* to C, the deprivation status of the school (a binary variable based on the proportion of pupils entitled to free school meals) and regional information. Logistic regression models were fitted for school non-response using all of these variables. The statistically significant terms were the school's deprivation status and its region. The school non-response weights were then calculated by cell weighting. These are shown in Table 11.
Table 11: Response by deprivation status of school and region (column: No. responders; values not present in this extract)
Deprived stratum in London
Deprived stratum outside London
Not deprived stratum in London
Not deprived stratum outside London
Pupil non-response
The responding schools yielded 20,447 students. Most of these (about 97%) could be matched to National Pupil Database records, and thus there was good information on, for example, their gender, ethnicity, GCSE performance, Government Office Region (GOR) and free school meals entitlement. There was very little useful informa
81. of the pupil's selection probability, scaled so that the weighted and unweighted achieved sample sizes were equal.
6.3.1 Achieved sample
There were a total of 13,539 productive or partially productive interviews in Wave 2: 456 from independent schools and 13,083 from maintained schools. These resulted in 12,439 productive Wave 3 interviews: 428 from independent schools and 12,009 from maintained schools. In addition, this includes two pupils who took part in Wave 1 but were non-responders in Wave 2 and were re-interviewed. Pupils' non-response weights were calculated as the reciprocal of their estimated response probabilities. Following the same method used in Wave 2, these response probabilities were calculated separately for independent and maintained school pupils.
6.3.2 Weighting maintained school pupils
The vast majority of Wave 2 responders provided both Young Person (YP) and Main Parent (MP) interviews. This provided a large amount of data to use in non-response weighting. There was considerably less information on the small number of partially productive Wave 2 respondents. As there was also a considerable difference in the response rates between those who were fully productive and partially productive (with the fully productive respondents having a much higher response rate), the Wave 2 weighting method of using different models for the two groups was repeated.
Fully productive respondents (Maintained schools)
A
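The arithmetic behind "the reciprocal of their estimated response probabilities" and "scaled so that the weighted and unweighted achieved sample sizes were equal" can be sketched as below. The probabilities and prior weights are hypothetical, and multiplying by a prior-wave weight inside the same function is an assumption of this sketch (the guide describes that multiplication for later waves); it is not a reproduction of the weights actually deposited.

```python
def nonresponse_weights(response_probs, prior_weights):
    """Combine prior weights with inverse response probabilities, then
    rescale so the weights sum to the number of achieved cases."""
    raw = [w / p for p, w in zip(response_probs, prior_weights)]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


# Hypothetical estimated response probabilities from a fitted model.
probs = [0.95, 0.80, 0.60, 0.90]
# Hypothetical prior-wave weights (all equal, to isolate the effect).
prior = [1.0, 1.0, 1.0, 1.0]

weights = nonresponse_weights(probs, prior)
```

Respondents who were least likely to respond receive the largest weights, and the rescaling keeps the weighted sample size equal to the unweighted one.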
82. old data, used to signify cases missing some household-level information from the respondent. These missing values have been applied to the majority of derived variables where necessary, but some derived variables may have required additional missing categories. These are fully documented in the derived variable documentation for each wave.
5.12 Variable names
The LSYPE Wave 1 data originally used variable names which directly corresponded with the questionnaire. However, since the original data were deposited, the variables have been renamed. This is to:
• enable users to clearly distinguish between the different waves of data for both cross-sectional and longitudinal analysis
• enable users to clearly distinguish between the different modules of the interview completed by the young person, the main parent or the second parent
Each variable name in the data has been revised to include a prefix to identify the wave of the survey, followed by the variable name which directly relates to the questionnaire, and then a suffix to identify who the question relates to (i.e. MP, SP or YP). Multicoded variables will also include an alphabetical suffix (two characters) and this will always be found at the end of the variable name. The only time this procedure is not followed is when it has been necessary to create derived variables on the dataset. An example of this is where a flag variable is derived when o
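The naming convention described above can be expressed as a small parser. This is a sketch under stated assumptions: the regular expression, the function name `parse_variable_name` and the reading of the two-character multicode suffix as a digit followed by a letter (as in names such as W2YouBulYP0a that appear elsewhere in this guide) are illustrative and not taken from the LSYPE documentation.

```python
import re

# Wave prefix (W + digit), questionnaire name, respondent suffix,
# and an optional two-character multicode suffix at the very end.
NAME_PATTERN = re.compile(
    r"^[Ww](?P<wave>\d)(?P<stem>\w+?)(?P<who>MP|SP|YP)(?P<multi>\d[a-z])?$"
)


def parse_variable_name(name):
    """Split a variable name into its parts, or return None when the name
    does not follow the convention (for example, derived variables)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    return {
        "wave": int(match["wave"]),
        "stem": match["stem"],
        "who": match["who"],
        "multicode": match["multi"],
    }


multicoded = parse_variable_name("W2YouBulYP0a")
single = parse_variable_name("W3Plann16YP")
derived = parse_variable_name("W1MPint")
```

Returning None for names that do not fit mirrors the guide's caveat that derived variables are the exception to the convention.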
83. ompted list-based questions used an active format, where respondents had to give a response to each item. For online and face-to-face interviewing, list-based questions usually use a passive format, where a list is shown to a respondent and they only identify the items that apply to them. This approach is generally adopted as it significantly lessens respondent burden, but comparability with the telephone interviews meant that the active approach had to be used. As a result of the need to use an active format, questions with prompted lists were kept to a minimum, and where they had to be used they included as few items as possible. This approach was felt to be too unwieldy for the activity history since the last wave, and this was the one item where the different modes adopted a different approach. Three different paper-based versions of the questionnaire were created: one for the web survey, one for the telephone survey and one for the face-to-face survey. All three versions were cross-checked to maintain consistency. This allowed virtually all variables to be constructed in the same format to allow for combined data analysis.
4.6 Fieldwork at Wave 6
Wave 6 fieldwork ran from 12 May 2009 to 14 October 2009. This wave of the study followed the approach carried out in Wave 5, where the young person could complete the interview either online, over the telephone, or face to face with an interviewer in their own home as they had done prev
84. onal linking and was already held for additional attainment information beyond Key Stage 4; as of November 2011 these data are not yet available.
8 APPENDIX A: QUESTIONNAIRE AND DATA PROBLEMS
The following tables describe some of the problems identified during Waves 2 to 7 and also include a column to identify what action was taken in relation to the data.
Table 22: Problems with Wave 2 questionnaire
Variable/section: W2Modap3a
Problem identified: Question not asked if respondent did not say yes at Modap2.
Action taken: Two variables have been created, one for (1) BMRB data only and one for (2) all company data but filtered for Modap2 = yes.
Variable/section: W2Disc1a
Problem identified: Question missing from the script.
Action taken: Variable in dataset only includes data collected by BMRB.
Variable/section: W2NumAlev & W2NumGCSE
Problem identified: There was a problem in the creation of the sample variable that meant that most respondents were not asked these questions.
Action taken: Data from respondents who did pass the filter has been left in the dataset.
Variable/section: W2Hrefper
Problem identified: In a small number of cases (c. 25) the Hrefper questions were answered by both parents.
Action taken: Data left in for both respondents.
Table 23: Problems with Wave 3 questionnaire
Variable/section: W3Plann16YP
Action taken: These variables are left in the dataset and a missing value
Problem identified: NOP and BMRB used different filters, therefore small number of respondents from NOP issued sample did not answer this
85. open-ended questions, such as longstanding illnesses, these are available as categorical variables only. As LSYPE has progressed, both legislation and guidance relating to the disclosure of personal information have changed. This means that on some occasions variables that were previously suitable for disclosure early in the study were no longer able to be disclosed at later waves. Applications for permission to use variables not deposited as part of the main dataset can be made to DfE using the Confidentiality Agreement form available with LSYPE documentation on the UK Data Archive (http://www.esds.ac.uk/findingData/snDescription.asp?sn=5545#doc) or from team.longitudinal@education.gsi.gov.uk. DfE will consider applications on a case-by-case basis.
5.15.1 LSYPE Family Background file
The content of the Family Background file is summarised in Table 3, indicating the level of information available at each wave.
Table 3: Summary of content of LSYPE Family Background file
Columns: Questionnaire Section and Content; Wave 1; Wave 2; Wave 3; Wave 4; Wave 5 (the per-wave availability markers are not legible in this extract)
Household section: Languages spoken in the home; Family activities; Household responsibilities; Household resources
Individual parent section: Demographics; Qualifications and education; Current activity; Second adult current activity; Health; Employment activity
86. oung Person File.sav'
 /BY surveyid.
EXECUTE.
SAVE OUTFILE='C:\TEMP\Wave One & Two LSYPE Young Person File with survey response.sav'.
This dataset will now represent a longitudinal dataset of the young person answers from Wave 1 and Wave 2, along with the survey-level response variable. This syntax can be altered to also incorporate data from other waves.
Example 2: Creating a main parent only file where the respondent remained the same person across Wave 1 and Wave 2
To create a longitudinal dataset that only incorporates answers given by the main parent (i.e. rather than working from a young person specific file), the following syntax provides an example of how to merge these together.
* Keep Index file rows where the mother was the main parent at both waves.
GET FILE='C:\TEMP\Longitudinal Index File.sav'.
SORT CASES BY surveyid (A) w1hhid (A).
SELECT IF (w1mpmother = 1 AND w2mpmother = 1 AND mainint = 0).
SAVE OUTFILE='C:\TEMP\Longitudinal Index File mothers only.sav'.
* Sort each Family Background file by surveyid ready for matching.
GET FILE='C:\TEMP\Wave One LSYPE Family Background File.sav'.
SORT CASES BY surveyid (A).
SAVE OUTFILE='C:\TEMP\Wave One LSYPE Family Background File.sav'.
GET FILE='C:\TEMP\Wave Two LSYPE Family Background File.sav'.
SORT CASES BY surveyid (A).
SAVE OUTFILE='C:\TEMP\Wave Two LSYPE Family Background Fil
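For readers working outside SPSS, the select-then-match logic of Example 2 can be sketched in plain Python. The extract stops before the matching step, so the join on surveyid shown here is an assumption about what follows. The toy rows and the answer fields `w1answer` and `w2answer` are hypothetical; only surveyid, w1mpmother, w2mpmother and mainint come from the syntax above.

```python
def merge_main_parent_files(index_rows, wave1_rows, wave2_rows):
    """Keep households where the mother was the main parent at both waves
    (and mainint == 0), then join the two wave files on surveyid."""
    keep = {
        r["surveyid"]
        for r in index_rows
        if r["w1mpmother"] == 1 and r["w2mpmother"] == 1 and r["mainint"] == 0
    }
    wave1 = {r["surveyid"]: r for r in wave1_rows}
    wave2 = {r["surveyid"]: r for r in wave2_rows}
    merged = []
    for sid in sorted(keep):
        if sid in wave1 and sid in wave2:
            merged.append({**wave1[sid], **wave2[sid]})
    return merged


# Hypothetical toy rows (one Index file row per household member).
index_rows = [
    {"surveyid": 101, "w1mpmother": 1, "w2mpmother": 1, "mainint": 0},
    {"surveyid": 102, "w1mpmother": 1, "w2mpmother": 91, "mainint": 1},
    {"surveyid": 103, "w1mpmother": 1, "w2mpmother": 1, "mainint": 0},
]
wave1_rows = [{"surveyid": s, "w1answer": s * 10} for s in (101, 102, 103)]
wave2_rows = [{"surveyid": s, "w2answer": s * 20} for s in (101, 102)]

merged = merge_main_parent_files(index_rows, wave1_rows, wave2_rows)
```

Household 103 drops out because it has no Wave 2 record, which is the behaviour of an inner join; a different join rule may be preferable depending on the analysis.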
87. ps 2
if (w1hhgrid gt 0 and w2hhgrid gt 0 and w3hhgrid gt 0 and w4hhgrid gt 0) resps = 1.
if (w1hhgrid gt 0 and w2hhgrid lt 0 and w3hhgrid lt 0 and w4hhgrid lt 0) resps = 2.
* The operators and code values in the next two statements were lost in extraction and are shown as found.
if w1hhgrid 992 w1hhgrid 971 and w3hhgrid 992 w3hhgrid 97 1 and w4hhgrid 97 1 and w2hhgrid gt 0 resps 3
if w1hhgrid 993 w1hhgrid 972 and w2hhgrid 993 w2hhgrid 972 and w4hhgrid 972 and w3hhgrid gt 0 resps 4
if (w1hhgrid lt 0 and w2hhgrid lt 0 and w3hhgrid lt 0 and w4hhgrid gt 0) resps = 5.
if (w1hhgrid gt 0 and w2hhgrid gt 0 and w3hhgrid lt 0 and w4hhgrid lt 0) resps = 6.
if (w1hhgrid lt 0 and w2hhgrid gt 0 and w3hhgrid gt 0 and w4hhgrid lt 0) resps = 7.
if (w1hhgrid gt 0 and w2hhgrid lt 0 and w3hhgrid gt 0 and w4hhgrid lt 0) resps = 8.
if (w1hhgrid gt 0 and w2hhgrid lt 0 and w3hhgrid lt 0 and w4hhgrid gt 0) resps = 9.
if (w1hhgrid lt 0 and w2hhgrid gt 0 and w3hhgrid lt 0 and w4hhgrid gt 0) resps = 10.
if (w1hhgrid lt 0 and w2hhgrid lt 0 and w3hhgrid gt 0 and w4hhgrid gt 0) resps = 11.
if (w1hhgrid gt 0 and w2hhgrid gt 0 and w3hhgrid gt 0 and w4hhgrid lt 0) resps = 12.
if (w1hhgrid gt 0 and w2hhgrid gt 0 and w3hhgrid lt 0 and w4hhgrid gt 0) resps = 13.
if (w1hhgrid gt 0 and w2hhgrid lt 0 and w3hhgrid gt 0 and w4hhgrid gt 0) resps = 14.
if (w1hhgrid lt 0 and w2hhgrid gt 0 and w3hhgrid gt 0 and w4hhgrid gt 0) resps = 15.
variable labels resps 'DV: Longitudinal household level response'.
value labels resps
 1 'Response in all waves'
 2 'Response in W1 only'
 3 'Respo
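The pattern-to-code mapping in the syntax above can be written compactly as a lookup on the set of waves with a positive household grid value. This is a simplified sketch: it treats any value greater than zero as a response and anything else as non-response, whereas the source tests specific codes for the "W2 only" and "W3 only" cases. The function name `longitudinal_response` is hypothetical.

```python
# Waves responded to -> resps code, following the statements above.
RESPS_CODES = {
    (1, 2, 3, 4): 1,
    (1,): 2, (2,): 3, (3,): 4, (4,): 5,
    (1, 2): 6, (2, 3): 7, (1, 3): 8, (1, 4): 9, (2, 4): 10, (3, 4): 11,
    (1, 2, 3): 12, (1, 2, 4): 13, (1, 3, 4): 14, (2, 3, 4): 15,
}


def longitudinal_response(w1hhgrid, w2hhgrid, w3hhgrid, w4hhgrid):
    """Return the resps code for one household member, or None if there
    was no response in any of Waves 1 to 4."""
    values = (w1hhgrid, w2hhgrid, w3hhgrid, w4hhgrid)
    waves = tuple(i for i, v in enumerate(values, start=1) if v > 0)
    return RESPS_CODES.get(waves)


all_waves = longitudinal_response(1, 1, 1, 1)
wave1_only = longitudinal_response(1, -1, -1, -1)
waves_2_and_3 = longitudinal_response(-1, 2, 3, -1)
```

A lookup table makes it easy to check that all fifteen non-empty combinations of four waves are covered exactly once.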
88. question has been added to identify the problematic cases on the variable W3Plann16YP. Further to this, a new variable has been derived, W3Plan16YP, that incorporates both W3Plann16YP and W3RetedYP.
Variable/section: W3RetedYP
Problem identified: NOP and BMRB used different filters, therefore small number of respondents from NOP issued sample who should have answered the W3Plann16 variable actually answered this variable.
Variable/section: W3Exclude
Problem identified: This variable was actually used for text fill purposes rather than to measure whether a young person had been excluded from school.
Action taken: This variable has been removed from the datasets.
Variable/section: W3AwareEMA & W3ApplyEMA
Problem identified: 2 cases had not answered w3ApplyEMA when they should have been asked.
Action taken: Due to the routing of these questions it was apparent that these cases suggested they were not aware of EMA in W3AwareEMA, therefore they have been set to No in W3ApplyEMA.
Table 24: Problems with Wave 4 questionnaire
Variable/section: W4NEETStatYP0a to W4NEETStatYP0g; W4NEET12YP
Problem identified: There was a mistake in the filter for the 2 NEET questions in the Word questionnaire. The filter in the Word questionnaire was: IF NEET (MainAct2 = 1 OR 2 OR 4) AND HARMCHECK <> 2 AND ExamChk = 2. However, the filter should have been: IF NEET (MainAct2 = 1 OR 2 OR 4) AND HARMCHECK <> 1 AND ExamChk = 2. The script used by BMRB/MORI used the correct filter throughout fieldwork, but the NOP script used the filter from the Word questionnaire
89. r activity. Whilst the Activity History files provide up to fourteen different activity categories, information on the types of activities has been summarised into four categories for the Monthly Main Activity file. These categories closely match those activities of most interest to the policies of the Department for Education. The four activity categories provided are the following:
• Education
• Employment
• Apprenticeship/Training
• Unemployed/Inactive (NEET)
The activities listed are recorded in the file through 45 'finac' variables, which represent the 45 months of data that are available. Each 'finac' variable takes a value that represents one of the four categories or states that there is 'Insufficient information'. Where the latter occurs, this indicates respondents where the time series has ended prematurely. This is usually due to sample attrition, or because the respondent was either not able to or refused to recall activities.
A small amount of editing of information from the Activity History files has been completed in the derivation of main activities, to improve the usefulness of the data and aid interpretation. For example, some activities which were missing start/end dates have had these dates imputed. Care has been taken to only impute dates in a window appropriate to the activities that precede and follow those where information is missing, also taking into account the dates on which the respondent w
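One way to work with a 45-month main activity series is to find where it ends and tally the months spent in each category. The sketch below assumes a hypothetical coding (1 to 4 for the four categories, 0 for 'Insufficient information') and a made-up respondent; the actual variable names and value codes should be taken from the data dictionary for the Monthly Main Activity file.

```python
from collections import Counter

# Hypothetical value codes; check the data dictionary for the real ones.
CATEGORIES = {
    1: "Education",
    2: "Employment",
    3: "Apprenticeship/Training",
    4: "Unemployed/Inactive (NEET)",
}
INSUFFICIENT = 0


def summarise_activity(series):
    """Return the number of months observed before the series ends and a
    tally of months per category over that observed period."""
    observed = 0
    for code in series:
        if code == INSUFFICIENT:
            break
        observed += 1
    counts = Counter(CATEGORIES[code] for code in series[:observed])
    return observed, counts


# Made-up respondent: 24 months in education, 12 in employment, then the
# series ends prematurely (for example through attrition).
series = [1] * 24 + [2] * 12 + [INSUFFICIENT] * 9
observed, counts = summarise_activity(series)
```

Stopping at the first 'Insufficient information' month follows the guide's description of a series that has ended prematurely; any later months are not counted.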
90. r categories and removed from the
Variable/section: W5AnyconYP0g
Problem identified: Responses of young people who mentioned mental health problems have been combined with W5AnyconYP0f 'YP: Potential problems at university: Health problems (including mental health/disability)'.
Action taken: W5AnyconYP0g not in dataset.
Variable/section: W5AnyconYP0m
Problem identified: Responses of young people who mentioned drug problems have been combined with W5AnyconYP0z 'YP: Potential problems at university: Other answers'.
Action taken: W5AnyconYP0m not in dataset.
Variable/section: W5NEETProbYP0f
Problem identified: Responses of young people who mentioned 'YP: Barriers to becoming EET: Have my own children/pregnant' have been combined with W5NEETProbYP0a 'YP: Barriers to becoming EET: Caring responsibilities'.
Action taken: W5NEETProbYP0f not in dataset.
Variable/section: W5NEETProbYP0i
Problem identified: Responses of young people who mentioned 'YP: Barriers to becoming EET: Mental health problems' have been combined into W5NEETProbYP0h 'YP: Barriers to becoming EET: Disability/health problems'.
Action taken: W5NEETProbYP0i not in dataset.
Variable/section: W5HEsub5YP
Problem identified: Young people who were coded to 'DV: Subject area of degree would like to study: Agriculture & related subjects' have been combined with W5HEsub4YP 'DV: Subject area of degree would like to study: Veterinary science'.
Action taken: W5HEsub5YP not in dataset.
Table 26: Issues with Wave 6 data
Variable/section: w6BenftsYP0b
Action taken: None.
Problem identified: Two young people who are not employed have responded to this question
91. re collected in some subsequent waves (collected for all members at Wave 2 and new household members at Waves 3 and 4), and age was calculated using a combination of the date of birth and the Wave 1 interview date.
Footnote 20: In the dataset, x should be replaced with the numerical value representing the wave of interest.
Footnote 21: See Box 5 for examples of how to match the Index File to the available cross-sectional datasets.
10.2.4 Hdobm and Hdoby
These variables identify the month and year of birth of all young persons, collected at the Wave 1 interview. Details of a household member's date of birth were only collected from Wave 2, and updated at Wave 3 and Wave 4 for any new respondents. These variables will therefore not include information for household members (other than the young person) who only responded at Wave 1 or who joined the household after Wave 4.
10.2.5 Sex
This variable is fixed across waves and refers to the sex of the individual identified within the household. This variable is fixed even for respondents who had swapped positions within the household, as the Index file uses the variable HHID to form the longitudinal structure. The variable uses the Wave 1 data as a starting point and is updated accordingly with any new household members identified in subsequent waves.
10.2.6 ReltoYP
This variable identifies the relationship of each household member to the Young Person at Wave 1, or at the first wave in which they joined. Whil
92. relevant interview section has been conducted. These variables are created using a variable based on who else was present during the specific interview section. These variables are only able to ascertain that the interview section was conducted; they do not attempt to ascertain what level of response was achieved. These examples are based on the Wave 1 data and can be recreated for subsequent waves. Variable 5 relates to Wave 3 only, as there was no specific SP interview at this wave. Variable 6 relates to Wave 4, to highlight that the interview may have been completed by either the second parent, the main parent or both. These variables are not applicable for Waves 5, 6 and 7 as there were no parental interviews.
Box 5: Syntax for Index file derived variables 1 to 6
1.
* Clear user-missing codes so that the code tested below can be compared.
missing values w1whopresmp0a ().
compute W1MPint = 1.
if (w1whopresmp0a = 99) W1MPint = 0.
variable labels W1MPint 'W1 Main Parent Interview Section conducted'.
value labels W1MPint
 0 'Main Parent interview not conducted'
 1 'Main Parent interview conducted'.
execute.
freq W1MPint w1whopresmp0a.
2.
missing values w1ppressp0a ().
compute W1SPint = 1.
if (w1ppressp0a = 99) W1SPint = 0.
if (w1ppressp0a = 98) W1SPint = 91.
variable labels W1SPint 'W1 Second Parent Interview Section conducted'.
value labels W1SPint
 91 'Not applicable: second parent not present in HH'
 0 'Second Parent interview not conducted'
 1 'Second Parent interview conducted'.
execute.
missing values W1
93. rent answered the Second Parent section at each wave. There can only be one second parent interviewee at each wave; therefore, if this respondent is the mother then the variable WxSPFather will be set to 91, and vice versa. If the household member is not the main or second parent then they will be set to 95 within these variables.
10.2.20 W2Newmember, W3Newmember, W4Newmember
This variable is at the individual level and indicates whether the household member was a new member at Waves 2, 3 or 4. It is possible that some household members were not identified as a new member at Wave x but were not apparent in a previous wave, and these have been assigned a value of 2 on WxNewmember.
10.2.21 Nouse and W6nouse
This is a flag variable that has been created to indicate cases where a problem has been identified in the Household Grid information obtained in Waves 1 and 2. These are cases where the information is not consistent between waves and additional checks have not been able to untangle the information collected. If Nouse = 1 then users are advised not to use these cases for any longitudinal comparisons. A further household that has longitudinal inconsistencies has additionally been identified in the variable W6nouse.
10.2.22 Mainint, Secint, Histint
These variables are derived to identify whether the main/second/history respondent was the same person in Waves 1 to 4, or whether this respondent changed, for example if t
94. required in order to complete robust analysis. This more recent weight is required to compensate for the demographic structure of the cohort changing over time, as not all Wave 1 respondents remained in the study until Wave 6. In some LSYPE datasets, specialist weights are additionally included to take account of particular situations. One example of this was the introduction of the sample boost at Wave 4, where in some cases additional weights are provided both with and without the boost cohort. The choice of weight in this situation should depend on whether variables from Waves 1, 2 or 3 are being used, as the boost cohort will not have variables from these three waves available. If such variables are being used, then the boost cases should be excluded if the option is available. There is an additional scenario with extra weights to cope with the introduction of young people who skipped particular waves of the study. As these 'skippers' will not have responses to questions from the wave they missed, a weight with these skippers removed should be used if looking across waves. If analysis is only focused on a single wave, then skippers can be included without any loss of accuracy.
6.9 Specifying the sample design
For more robust analysis, such as the estimation of standard errors, it is preferable to specify the sample design. SPSS requires an additional module called 'complex samples' to specify the sample design. Box 3 provides t
95. responses. There was some variability in response probability depending on the sex of the student and the type of school (single sex or mixed). To account for these differences, logistic regression models were used to establish which variables could be used to weight the data. From this, the only useful variables were the student's sex and their type of school. Cell weighting was then used to derive pupil non-response weights. The weights obtained are shown in Table 8.

Table 8 Response by type of school and sex of young person
Category                   No. responders
Boys in boys' schools      120
Boys in mixed schools      156
Girls in girls' schools    161
Girls in mixed schools     92

One respondent's sex was unknown, so they were given an average weight. The initial design weights were trimmed and combined with the pupil non-response weights. Calibration weights were finally applied so that the achieved sample size matched the population breakdown by type of school (single sex or mixed) and by region (London, not London). The population figures are given in Tables 9 and 10.

Table 9 Population proportions by type of school
Category                   Population proportion
Boys in boys' schools
Boys in mixed schools
Girls in girls' schools
Girls in mixed schools

Table 10 Population proportions by region
Category                   Population proportion
London                     18.9
Rest of England            81.1

6.1.2 Weighting maintained school pupils

838 schools were selected in the mai
96. rs just before commencing their assignment. Letters were sent to all the young people who participated in Wave 1 and were selected for Wave 2. A second letter was also sent to the parents/guardians of the young people, which included a fact sheet that gave specific details regarding why the parents were being contacted and the purpose of the second wave. Both letters advised that the interviewer would be calling at their address following receipt of the letter. The young person's letter also included an unconditional £5 gift voucher incentive.

In total, the interview consisted of five modules. The sample member completed one module, the young person interview, which lasted approximately 35 minutes. Adult interviews were also completed for household information, the main parent interview, second parent interview and child history. These lasted approximately 35 minutes altogether. However, there was considerable variation in the length of adult interviews depending on whether the adult was interviewed at Wave 1 or not. Interviews with adults not interviewed at Wave 1 took longer, as interviewers had to collect some of the data missed at Wave 1. The total target interview time was 1 hour and 10 minutes.

As at Wave 1, the main parent was defined as the parent most involved in the young person's education, whilst the second parent was defined as an adult other than the main parent who had a parental relationship to the young person, i.e. a natural
97. ... 13
3.2 Response rates at Wave 2 13
3.3 Response rates at Wave 3 13
3.4 Response rates at Wave 4 13
3.5 Response rates at Wave 5 14
3.6 Response rates at Wave 6 14
3.7 Response rates at Wave 7 14
4 FIELDWORK 16
4.1 Fieldwork at Wave 1 16
4.2 Fieldwork at Wave 2 17
4.3 Fieldwork at Wave 3 18
4.4 Fieldwork at Wave 4 19
4.5 Fieldwork at Wave 5 23
4.6 Fieldwork at Wave 6 25
4.7 Fieldwork at Wave 7 27
4.8 Questionnaire modules
98. s for the person's relationship to the young person than in previous waves. This is because of the nature of the mixed modes data collection, where it may have been sensitive to allow the respondent to see more detailed categories.
Action taken: None.

Variable/section: W5anyconb, W5anycon
Problem identified: In the questionnaire this question referred specifically to financial problems that the young person might encounter if they decided to go to university. However, many of the backcoded 'other' responses refer to problems that are not related to financial issues. Conversely, many young people gave responses relating to financial problems when answering W5anycon, which asked about problems other than costs and finance.
Action taken: None; retained responses as given by the young person. Anyone analysing these variables may want to consider using both sets of responses.

Variable/section: W5BenftsYP0a to W5BenftsYP0i
Problem identified: Backcoding of 'other' responses in the Benfts variables means that the numbers of people answering the follow-up benefits questions do not always match up.
Action taken: None; it was not appropriate to code these cases to script error, since the young person had responded 'other', meaning that correct routing had been followed.

Variable/section: W5lfUni
Problem identified: 128 cases who didn't give a valid response in Unisubb were then routed to answer W5lfUni. The majority were completing the web survey w
Action taken: A flag variable W5lfuniFlag has been created to
99. s relating to young people. It has been used to monitor the progress of the cohort group, evaluate the success of policies aimed at this group, and provide an evidence base for further policy development. The study has brought together responses to seven annual household interviews with data from administrative sources, and reflects the variety of influences on learning and progression. The annual interviews obtained information from the young person and additional information from a main and second parent interview. The information collected provides data about:
• the young person's family background
• parent(s)' socio-economic status
• personal characteristics
• attitudes, experiences and behaviours
• attainment in education
• parental employment
• income and family environment, as well as local deprivation
• the school(s) the young person attends/attended
• the young person's post-16 plans and activities

1 At Waves 5, 6 and 7 parent(s) are no longer interviewed.

LSYPE, known to respondents as Next Steps, was commissioned by the former Department for Education and Skills (DfES) in 2004. The study has subsequently been managed by the successor departments to DfES: the Department for Children, Schools and Families (DCSF), 2007 to 2010, and the Department for Education (DfE), 2010 to 2011.

1.2 Objectives

The main objectives of LSYPE have been to gather evidence about the transitions young people make from secondary
100. s script error.

Variable/section: Fundstud
Problem identified: Early script error which was resolved on the later releases.

Variable/section: EverDDA
Problem identified: NOP filter was placed within a previous filter.
Action taken: Coded as script error.

Variable/section: Exten2
Problem identified: NOP filter for this question was placed in the wrong position, which meant that the question could not be asked.
Action taken: Coded as script error.

Table 25 Issues with Wave 5 data

Variable/section: W5Shgint
Problem identified: BMRB / GfK NOP edited the responses for a small number of cases where the young person had refused to say whether a household member was present in the household. This mainly applies to the young person's parent. This has led to a number of discrepancies when comparing those who appear to have a parent present in the household with the cases who have responded to the Parent section of the interview.
Action taken: The variable W5ShgintFlag indicates the cases whose responses have been edited.

Variable/section: Household Grid
Problem identified: A small number of young people completing the online survey were swerved around the household section, as it was felt that the interviewer notes about household members would not be understood by the respondent.
Action taken: Coded to 999 in the relevant variables.

Variable/section: W5gender and W5relation (Household Grid file)
Problem identified: Where a person is no longer living in the household, there is inconsistency regarding whether their sex/gender has been retained in the grid or has been coded to not applicable.
Action taken: None.

Variable/section: W5relation
Problem identified: At Wave 5 there are fewer categorie
101. s were selected to account for the probability of being in the sample. At Wave 6 these were the final weights from Wave 5. With these weights applied, the profile of the issued cases was then compared to that of the achieved cases with regard to a range of variables from earlier waves. There were two sections to the weighting. The first part was among those who took part in the study at Wave 5 and the other was among those who did not (Wave 5 skippers).

For those who took part in the previous wave, CHAID was used to identify groups who had statistically significant differing response rates. These included combinations of:
• Number of different contact details provided at previous wave
• Whether given permission to link answers with Department for Work and Pensions data
• Month of interview in previous wave
• Tenure in previous wave
• Mode of completion in previous wave
• Whether going to school or college at previous wave, and level of qualification being studied
• How well teachers expected them to do in their Year 11 or earlier exams
• Whether applied for higher education in previous year
• Likelihood of voting in next general election

For those who skipped the Wave 5 survey, logistic regression looked at response propensities, and those that were found to be significantly related to response likelihood were:
• Number of different contact details provided by respondent at Wave 4
• Whether still living at same address in Wave 3 a
102. se NEET 69.77 78.01

For the boost cohort there were too few people to consider logistic regression, therefore cell weighting was also used. The variable used for the cell weighting on this occasion was whether or not the sampled young person was studying for A Levels at Wave 4. The response rates within the groups are shown in Table 20.

Table 20 Response rates used to calculate Wave 5 boost cohort weights
Study at Wave 4            Response rate (%)
Not studying A Levels      56.21
Studying A Levels          72.86

To obtain the final non-response weights, the inverse of the response rate was taken for each subgroup and multiplied by the design weight to achieve the final non-response weight. A final weight was then applied using rim weighting in order to match the profile of the respondents to that of the population. The target profiles are shown in Table 21.

Table 21 Target proportions for weighting at Wave 5
Categories                       Proportion (%)
Sex
  Male                           50.59
  Female                         49.41
GOR
  North East                     5.34
  North West                     14.86
  Yorkshire and The Humber       10.64
  East Midlands                  8.91
  West Midlands                  11.43
  East of England                11.04
  London                         12.48
  South East                     15.48
  South West                     9.82
Ethnic Group
  White/other/Not known          88.86
  Bangladeshi                    0.98
  Pakistani                      2.35
  Indian                         2.29
  African                        1.82
  Caribbean                      1.45
  Mixed                          2.26

6.6 Wave 6 weights

Weights to account for non-response from certain groups between Waves 5 and 6 were calculated in two stages. Firstly the design weight
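The cell weighting step described above for the Wave 5 boost cohort (design weight multiplied by the inverse of the subgroup response rate) can be sketched in Python as follows. This is a minimal illustration only: the function names and group labels are hypothetical, the response rates are those quoted in Table 20, and the later rim weighting stage is not shown.

```python
# Response rates from Table 20, expressed as proportions.
# The dictionary keys are hypothetical labels, not LSYPE variable values.
RESPONSE_RATES = {
    "not_studying_a_levels": 0.5621,
    "studying_a_levels": 0.7286,
}


def nonresponse_weight(design_weight, response_rate):
    """Return the design weight multiplied by the inverse response rate."""
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in the interval (0, 1]")
    return design_weight / response_rate


def boost_cohort_weights(cases):
    """Weight each (design_weight, group_label) pair using its cell's rate."""
    return [
        nonresponse_weight(design_weight, RESPONSE_RATES[group])
        for design_weight, group in cases
    ]
```

Groups with a lower response rate receive a larger weight, so that respondents in those groups stand in for the non-respondents from the same cell.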
103. sked of Main or Second parent. The variable W4sourceSP indicates who answered these questions.

12 Full details of all derived variables are available in the LSYPE Derived Variable Documentation, which has been deposited on a Wave-specific level.

MP: Age first left education
SP: Age first left education

If a variable is asked of the main or second parent but relates to the young person, this is also clearly defined in the labels, for example:

MP: Why YP no longer lives with natural parents

At Wave 5, a marker was included in variable labels to indicate that the question or response categories differ from those in previous waves, even though the variable name remains the same longitudinally. For example:

DV: Employment status of mother

5.14 Data cleaning

Each wave of the LSYPE data has gone through an extensive process of checks to ensure the consistency and validity of the data. These checks investigate any outliers found within the data, ensure that the data has followed the routing used in the questionnaire, ensure that the correct person has answered the relevant questions, and ensure that information is consistent between directly comparable variables.

During the process of checking the data it was necessary to edit some responses and to create missing value categories to identify particular issues such as item non-response. For example, the Household Grid collects the relationships of
104. st it is possible for relationships to change over time due to their subjective nature, this is not captured within the Index file. Any users interested in identifying changes in relationships should refer to the wave-specific Household Grids available. At Waves 5, 6 and 7, less detailed information was collected about the household member's relationship to the Young Person, so anyone joining the household after Wave 4 is coded to 989 in this variable.

10.2.7 ReltoYP2

This variable contains similar information to ReltoYP but is based on the reduced coding scheme for this question at Waves 5, 6 and 7. For those who were in the household prior to Wave 4, ReltoYP has been recoded into the reduced categories. For those who have joined the household after Wave 4, the information is taken directly from the interview response.

Footnote: It is possible that there are slight discrepancies between reported age and the Hdobm and Hdoby variables. This will mainly be where the day of birth results in a household member being a year younger or older when compared to the date of interview.

10.2.8 WxMainres

This variable can be used to identify the position of the main parent within the household and corresponds to WxHhid. This variable is at the household level, therefore all members of a household will have the same value unless they were not present at a specific wave.

10.2.9 WxSecores

This variable shows the position of the second parent, if applicable
105. stic regression model was used to estimate the probabilities of each individual to respond to the survey. Data were again weighted by the Wave 2 weight, scaled to equal the achieved sample size. Fewer variables were available for non-response modelling, either because they hadn't been collected in Wave 2 or because sample sizes were too small to allow their use. The same logistic regression model used in the previous wave was used, with White ethnic group, Level 2 educational achievement and sex included.

6.3.3 Weighting independent school pupils

As with the maintained pupils, a logistic regression model was used to estimate the probabilities of each individual to respond to the survey. Once again, data were weighted by the Wave 2 weight, scaled to equal the achieved sample size. A forward regression approach identified whether the Main Parent has a pre-1975 O level, and a variable about the YP being happy at school, as being statistically significant. In addition, an interaction term between sex and type of school was forced into the model.

6.3.4 Combining maintained and non-maintained weights

The final stage was to create a file containing the estimated response probabilities. The reciprocals of these probabilities gave the unscaled non-response weights. The non-response weights were scaled to have a mean of 1. The weights had low variability, mainly because of the high response rate. The weights for independent school pupils wer
106. t
• Household member
• Sex of person
• Relationship to young person
• Marital status of person
• Whether person living with someone in the household as a couple
• Employment status of person
• Ethnic group of person
• Position of Main parent
• Position of Second parent
• Position of mother
• Position of father
• Position of History respondent
• Position of HHgrid respondent

At Wave 4 the History respondent is the Main Parent.
Indicates that this information was asked at Waves 5, 6 and 7; all other information is not applicable or was not asked in the Household Grid in these waves.

The Household Grid contains one record for each person who has ever appeared in the household, for each family that participated at Wave 1.

The individual details pertaining to the young person, the main parent and the second parent, such as age, sex, marital status and family composition, are available on the cross-sectional files. Where applicable, these variables have been derived from the Household Grid and are included in the Family Background file and the Parental Attitudes file. Any analyses relating either to the demographic information of other household members or to relationships between other household members must be done using the LSYPE Household Grid files.

The LSYPE Household Grid file also includes individual-level identifiers for the young person, main parent and the second parent that identify their position within the Household Gr
107. taken.

Table 15 Variables identified for logistic regression model
Origin of variable: Sampling Frame; YP Wave 1 response; MP Wave 1 response

Variable                                            Comments on Wave 2 response
GOR
Ethnicity                                           African pupils were least likely to respond
YP's qualifications                                 Those with level 2 qualifications were most likely to respond
YP's sex
School's admission status
School's deprivation status
YP's plans for education after the age of 16        Those planning on leaving education at the age of 16 were less likely to respond
YP having a computer at home                        Those with a home computer were more likely to respond
Whether a single parent                             Those from single parent families were less likely to respond
Current working status
Whether MP claimed JSA                              Pupils whose MP claimed JSA were less likely to respond
Whether MP has an A level                           Pupils whose MP had an A level were more likely to respond

Partially productive respondents

It was harder to find a useful model for non-response of the partially productive Wave 1 respondents. This was because their Wave 1 interviews contained less useful information, and also because of the small size of the dataset. A final logistic regression model included three explanatory variables: ethnicity (made into a binary variable, White or not White), qualifications (whether or not they had obtained Level 2) and sex.

6.2.2 Modelling response from independent school p
108. ... 44
5.15 Datasets 45
5.15.1 LSYPE Family Background file 46
5.15.2 LSYPE Parental Attitudes file 47
5.15.3 LSYPE Young Person file 48
5.15.4 LSYPE Activity History file and Monthly Main Activity file 50
5.15.5 LSYPE Household Grid 51
5.15.6 LSYPE History file (Waves 1 and 2) 53
6 WEIGHTING 55
6.1 Wave 1 weights 55
6.1.1 Weighting independent school pupils 55
6.1.2 Weighting maintained school pupils 56
6.1.3 Combining maintained and independent school weights 59
6.1.4 Effects of the weighting 59
6.2 Wave 2 weights 59
6.2.1 Modelling response from maintained school pupils 60
6.2.2 Modelling response from independent school pupils 62
6.2.3 Creating the weights 63
6.2.4 Effects of the non-response weighting 63
6.3 Wave 3 weights
109. tion. Please note that at Wave 4 it was possible for the main parent to answer the second parent questions. Due to this, the variable W4SPINT includes some additional categories which indicate who completed the Second Parent interview.

10.2.13 NW3SPint

As discussed in Section 4.3, the Second Parent interview was not collected at Wave 3; however, it was possible for a second parent to be present during the Main Parent interview. This variable was specifically derived to identify households where interviews were conducted jointly with both the main and the second parent, or with just the main parent. This variable uses a slightly different naming convention to highlight the difference between this and the variables described in Section 10.2.11.

10.2.14 WxYPint

This variable shows whether the Young Person interview was conducted at Wave x. The variable is at the household level and as such all members of a household will have the same value, unless they were not present at the wave in question.

10.2.15 WxHistint (Waves 1, 2 and 4 only)

This variable shows whether the History Section was conducted at Wave x and is reported at a household level. All members of a household will therefore have the same value, unless a household member was not present at a specific wave.

6 The History Section did not form part of the Wave 3 interview.

10.2.16 WxMother

Th
110. tion on those without a National Pupil Database match.

3. Those with a match to the National Pupil Database were processed separately to those without. Those without a match were given a mean weight, whilst for those with a match logistic regression models were used to estimate the probabilities of response. The final logistic regression model included the terms GOR, ethnicity, qualifications and an interaction term between GOR and White ethnic group. The weight was then calculated as the reciprocal of the response probability.

Calibration

Design weights were next combined with school non-response and pupil non-response weights to calculate combined weights, which were calibrated to the population proportions given in Table 12. These proportions are sourced from the National Pupil Database.

16 Nine categories: White, Bangladeshi, Pakistani, Indian, African, Caribbean, Mixed, Other, Not Obtained/Refused.
17 Three categories: Achieved Level 2 (5 GCSEs or equivalent at A* to C); Achieved Level 1 (5 GCSEs or equivalent at A* to G) but not Level 2; did not achieve Level 1.

Table 12 Proportion of young people by demographical breakdown
Category                         Proportion (%)
Ethnicity
  White                          83.0
  Bangladeshi                    1.0
  Pakistani                      2.3
  Indian                         2.3
  African                        1.8
  Caribbean                      1.4
  Mixed                          2.2
  Other                          2.5
  Not obtained                   3.5
GOR
  North East                     5.3
  North West                     14.8
  Yorkshire and The Humber       10.6
  East Midlands                  8.9
  West Midlands                  11.4
111. tities. However, when the calculated weights are very variable, the weighting process will increase the random error in the estimates, thus reducing their precision. The effect the weights have on precision can be measured by their efficiency, or by the design effect (essentially the reciprocal of the efficiency), as shown in Table 16.

Table 16 Design effects
Stage of weighting on Wave 2 data        Design effect due to weights
Using the Wave 1 weight (W1Finwt)        1.263
Using the Wave 2 weight (W2Finwt)        1.278

A rough interpretation is that the design has an efficiency of 1/1.278 (78.3%) relative to an equal probability sample taken with the same amount of stratification and clustering. The effects of the non-response weighting can be summarised by comparing the final two rows of this table. The design effect obtained using W2Finwt is only slightly greater than that obtained using W1Finwt. This means that the non-response weighting is associated with a high level of efficiency. This is because the high response rate, and the fact that response rates were quite similar among the main sub-groups, led to very little variability in the non-response weights.

6.3 Wave 3 weights

This section explains how the data was weighted to account for the non-response between Waves 2 and 3. A design weight was provided by the fieldwork consortium; this variable is called designweight and is available in the Wave 1 dataset. This is the reciprocal
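The design effect due to weights discussed in Section 6.2.4 above can be illustrated with a short Python sketch. The guide does not state the formula that was used; the sketch assumes Kish's common approximation, deff = n * sum(w^2) / (sum(w))^2, with efficiency taken as 1/deff. The function names are hypothetical.

```python
def design_effect_due_to_weights(weights):
    """Kish's approximate design effect from unequal weights.

    Assumption: deff = n * sum(w^2) / (sum(w))^2. The guide does not
    confirm that this is the formula behind Table 16.
    """
    n = len(weights)
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    return n * sum_of_squares / (total * total)


def weighting_efficiency(weights):
    """Efficiency is the reciprocal of the design effect."""
    return 1.0 / design_effect_due_to_weights(weights)
```

With equal weights the design effect is exactly 1; the more variable the weights, the larger the design effect and the lower the efficiency.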
112. to face interviews. In line with Wave 5, only the sampled young person completed the interview at Wave 6.

3.7 Response rates at Wave 7

Of the 9,791 young people issued at Wave 7, the survey reached 8,682 households (90%). This was made up of 3,965 (40%) online interviews, 3,942 (40%) telephone interviews and 1,715 (18%) face-to-face interviews. In line with Waves 5 and 6, only the sampled young person completed the interview at Wave 7.

4 For a full explanation of the mixed modes used for data collection at Wave 5, please see Section 4.5.
5 A further 165 (2%) face-to-face interviews were issued but not resolved.

4 FIELDWORK

Fieldwork for the first four waves was carried out by a consortium of BMRB, GfK NOP and Ipsos MORI. Data was collected via face-to-face interviewing using computer assisted personal interviewing (CAPI). At Waves 5, 6 and 7, fieldwork was carried out by BMRB and GfK NOP only, and a mixed mode approach was used (see Section 4.5). Data have been supplemented by linkage to administrative records, such as the National Pupil Database, and other data sources, such as geo-demographic data from the 2001 census. Validation of the data collected and enhancement of the study was undertaken by NatCen for Wave 1 to Wave 6 and by DfE for Wave 7.

4.1 Fieldwork at Wave 1

A two-stage process was completed for fieldwork at Wave 1. Firstly, an advance letter was sent to all head teachers at the sampled schools introducing the study
113. ty
Employment Activity History for NEW ENTRANTS and respondents not interviewed at Wave one

Second Parent
Employment Activity History since LAST INTERVIEW
Current Activity
Employment Activity History for NEW ENTRANTS and respondents not interviewed at Wave one

History Section
Birth and Health
Relationship History

6 WEIGHTING

This section explains the development of weights for LSYPE data, which were created to ensure any resulting analysis can account for the survey design for each wave. Section 6.1 discusses the preliminary process of deriving the Wave 1 weights. This weighting procedure was twofold, with pupils from maintained schools and those from non-maintained schools weighted separately. Weights for subsequent waves are discussed in Sections 6.2 to 6.7. The correct method for establishing which weight to use for analysis is identified in Section 6.8, and details of how to specify the sample design using SPSS are provided in Section 6.9.

6.1 Wave 1 weights

In the first instance a design weight was provided by the fieldwork consortium; this variable is called designweight. The value of this is the reciprocal of the pupil's selection probability, scaled so that the weighted and unweighted achieved sample sizes were equal.

6.1.1 Weighting independent school pupils

Fifty-four schools from outside the maintained sector were sampled and 28 of these took part in the study. These 28 schools yielded 530
114. ubsequent wave. This file is deposited as a hierarchical file containing one row for each individual who has ever appeared in the household. The Wave 1 data forms the basis of the file and has been updated with the information from the subsequent waves. Therefore, if a household member moved out of the household at any wave, their details are still held in this file. If a new member has entered the household, the Index file is updated to include their information. This file represents all the 15,770 young people who participated in the study at Wave 1, plus 352 Wave 4 boost cases, and also represents all members of the young person's household identified at any wave.

The Index file contains individual identifiers, such as the person number of the respondents at all waves, and fixed characteristics such as age, sex and relationship to the young person. The variables included in this file are described in more detail in Section 10.2.

The individual details pertaining to the young person, the main parent and the second parent, such as marital status and family composition, are available on the cross-sectional files (Waves 1 to 4). These variables have been derived from the Household Grids and are included on the family background files and the parental attitudes files corresponding to each wave.

10.1 How to use the Index file

The Index file includes individual-level identifiers for the young person, main parent and the second parent
115. upils

The Wave 1 data set included 530 pupils from independent schools. A forward stepwise logistic regression procedure, including several variables as potential explanatory variables, was used to model whether or not a pupil responded. The variables included in the final logistic regression model were whether the school was in London, whether the MP had a pre-1975 O level, the YP's and MP's attitude to school and school work, the YP's sex and type of school (boys, girls or mixed).

In general, a positive evaluation of school (measured by whether the YP strongly agreed with the statement 'School work is worth doing' and whether the MP was very satisfied with the YP's progress at school) was associated with a high probability of response. Those whose parents had a pre-1975 O level also had a higher probability of response. Pupils from London schools had a lower response rate.

6.2.3 Creating the weights

The reciprocals of the estimated response probabilities gave the unscaled non-response weights. The top and bottom 1% were trimmed and then scaled to have a mean of 1. These non-response weights, called DesignweightSCALED, range from 0.90 to 1.53 (SD = 0.11236). The Wave 2 weight, W2Finwt, was calculated by multiplying W1Finwt by W1toW2NrWt and scaling to ensure they had a mean equal to 1.

6.2.4 Effects of the non-response weighting

The purpose of the non-response weighting is to eliminate bias in the estimates of population quan
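The weight construction described in Section 6.2.3 (reciprocal of the estimated response probability, trimming of the extreme 1% at each end, scaling to a mean of 1, then multiplying by the previous wave's weight and rescaling) can be sketched in Python as below. This is an illustration under stated assumptions rather than the procedure actually run: "trimmed" is read here as capping extreme weights at the nearest retained value, and all function names are hypothetical.

```python
def scale_to_mean_one(weights):
    """Rescale weights so that their mean equals 1."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


def trim_tails(weights, fraction=0.01):
    """Cap the top and bottom `fraction` of weights.

    Assumption: trimming means capping (winsorising) at the nearest
    retained value; the guide does not specify the exact rule.
    """
    ordered = sorted(weights)
    k = int(len(ordered) * fraction)  # number of weights in each tail
    lower, upper = ordered[k], ordered[-k - 1]
    return [min(max(w, lower), upper) for w in weights]


def nonresponse_weights(response_probabilities, fraction=0.01):
    """Reciprocal of response probability, trimmed, then scaled to mean 1."""
    raw = [1.0 / p for p in response_probabilities]
    return scale_to_mean_one(trim_tails(raw, fraction))


def combine_weights(previous_weights, nonresponse):
    """Multiply the previous wave's weight by the non-response weight and rescale."""
    product = [a * b for a, b in zip(previous_weights, nonresponse)]
    return scale_to_mean_one(product)
```

Scaling to a mean of 1 keeps the weighted and unweighted sample sizes equal, which is the convention the guide describes for the design weight.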
116. utliers have been edited for inclusion in a derived variable. Normally the raw data is left unedited and the change is made during the derivation of new variables. For example, in order to derive particular income variables such as gross annual salary, it was necessary to check outliers and clean the data based on the assumptions made during these checks. The flag variable is therefore provided for those interested in specifically looking at the edited data and comparing it with the unedited data. These variables do not use a wave prefix but start with the word flag.

A typical variable name is made up of the following characters:

Prefix1 Question name Suffix1

A multicoded variable will use the following characters:

Prefix1 Question name Suffix1 Suffix2

Prefix1: Indicates the wave (W1 = wave 1, W2 = wave 2, etc.).
Question name: Is directly comparable with the questionnaire. It is easy to search for questions within your dataset as long as you use the relevant wave prefix in front of the question/variable name.
Suffix1: Indicates who the question was asked of: YP = the Young Person, MP = the Main Parent and SP = the Second Parent.
Suffix2: Indicates a multicoded variable, which can range from 0a (answer 1) to, for example, aw (answer 49).

1 It is highly unlikely that a multicoded variable will use more than 70 categories, but this suffix system would allow for approximately 700 categories, as the two characters of the suffix en
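The Suffix2 scheme described above can be decoded with a short Python sketch. The exact encoding is an inference from the two examples given (first answer and answer 49) and from the stated capacity of approximately 700 categories: the first character is assumed to run through the digit 0 and then a to z, and the second through a to z. The first character of the first suffix is read as the digit 0, as in variable names such as W5BenftsYP0a. The function name is hypothetical.

```python
import string

LETTERS = string.ascii_lowercase


def multicode_answer_number(suffix):
    """Convert a two-character multicode suffix to its answer number.

    Assumed scheme (inferred, not confirmed by the guide):
    first character 0, a, b, ..., z selects a block of 26 answers;
    second character a..z selects the answer within that block.
    """
    if len(suffix) != 2:
        raise ValueError("suffix must be exactly two characters")
    first, second = suffix[0].lower(), suffix[1].lower()
    if first == "0":
        block = 0
    elif first in LETTERS:
        block = LETTERS.index(first) + 1
    else:
        raise ValueError("first character must be 0 or a letter")
    if second not in LETTERS:
        raise ValueError("second character must be a letter")
    return block * 26 + LETTERS.index(second) + 1
```

Under this reading the scheme allows 27 x 26 = 702 distinct suffixes, which is consistent with the "approximately 700 categories" mentioned in the footnote.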
117. where applicable, which identify their position within the Household Grid. The young person is always fixed as person one in all households and can be identified by selecting on the variable HHID = 1. Both the main and second parent can take any position within the Index file, but can be identified for each wave using the variables WxMAINRES for the main parent and WxSECORES for the second parent. This only applies to Waves 1 to 4, since from Wave 5 main and second parents were no longer identified in the interview. Similarly, the history respondent can take any position within the Index file and can be identified for each wave using the variables WxHISTRES.

Users who wish to create person-specific datasets, i.e. with mother-only responses or with second-parent-only responses, can use a combination of the variables discussed in Sections 10.2.16 to 10.2.19. The six variables WxHHRESP indicate whether there is information for an individual household member at a particular wave.

Survey-level information is also available relating to Waves 1 to 4, providing users with an overall indication of response for each respondent, i.e. main/second parent, young person and history respondent. These variables are discussed in more detail in Section 10.2.22 onwards and will help users interested in longitudinal analysis of the data.

10.2 Variable descriptions

The first two letters of each variable contained within this dataset indicate the wave of the surve
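The selection logic described in Section 10.1 can be sketched in plain Python on a list of Index file rows held as dictionaries. This is a hedged illustration: HHID and WxMAINRES are the variable names given in the text, but the `household` key, the toy values, and the assumption that the position stored in WxMAINRES is directly comparable with HHID are all hypothetical.

```python
# Toy rows standing in for Index file records (values are invented).
example_rows = [
    {"household": "A", "HHID": 1, "W1MAINRES": 2},
    {"household": "A", "HHID": 2, "W1MAINRES": 2},
    {"household": "A", "HHID": 3, "W1MAINRES": 2},
    {"household": "B", "HHID": 1, "W1MAINRES": 3},
    {"household": "B", "HHID": 3, "W1MAINRES": 3},
]


def young_person_rows(index_rows):
    """The young person is always person one, so select HHID == 1."""
    return [row for row in index_rows if row["HHID"] == 1]


def main_parent_rows(index_rows, wave, position_key="HHID"):
    """Select the rows whose position matches WxMAINRES for the given wave.

    Main parents are only identified at Waves 1 to 4. The comparison of
    WxMAINRES with `position_key` is an assumption made for illustration.
    """
    if wave not in (1, 2, 3, 4):
        raise ValueError("main parents are only identified at Waves 1 to 4")
    position_var = f"W{wave}MAINRES"
    return [
        row for row in index_rows
        if row.get(position_var) == row[position_key]
    ]
```

The same pattern would apply to WxSECORES and WxHISTRES for the second parent and history respondent, subject to the same assumption about how positions are stored.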
118. which was then followed up by contact from BMRB interviewers to collect contact information for the sampled pupils. Secondly, advance letters were sent to both the parents/guardians and young people at the selected addresses. Both letters introduced the survey, explained that an interviewer would be calling at their address and why, and also advised that all young people who participated would be given a £5 gift voucher. Wave 1 fieldwork ran from 30 March 2004 to 18 October 2004. In total, the interview consisted of five modules. The sample member completed one module, the young person interview, which lasted approximately 35 minutes. Adult interviews were also completed for household information, the main parent interview, the second parent interview and child history. These lasted approximately 55 minutes altogether. The total interview time was 1 hour and 30 minutes. All young people who completed an interview were given a £5 high street voucher. In 14 cases at Wave 1, no second parent was identified in the household interview but a second parent interview was subsequently conducted. None of these households responded at subsequent waves. 4.2 Fieldwork at Wave 2. Wave 2 fieldwork ran from 18 April 2005 to 18 September 2005. Wherever possible, interviewers were assigned to the same households they had interviewed at Wave 1. All interviewers were briefed via face-to-face briefings. Advance letters were sent by interviewe
119. would be returning to the parental home during the fieldwork period (e.g. whilst on leave) and to arrange an interview for then. If this was not the case, they were asked to collect the contact details of the armed forces accommodation and pass these back to head office, who would collate them and try to arrange access. If the young person had moved to college, other educational residential accommodation or employer residential accommodation, interviewers were again asked to try to find out if the young person would be returning to the parental home during the fieldwork period and to arrange an interview for then. If this was not the case, they were asked to try to obtain contact details for the new accommodation and attempt to follow them up there. If the young person was now in prison or a young offenders institute, interviewers were asked to collect contact details of the institution and pass these back to head office, who would decide on the most appropriate way to follow up these cases. The interview consisted of four modules. Pre-Survey Module: this was asked first and established whether the contact address was where the young person usually lived; whether the young person was living with their parents/guardians; the type of accommodation the young person was living in; and who should complete the Household Grid. Household Information Module. Young Person Module: this was completed by the sample member and took approximately
120. y for the activity history since the last wave, and this was the one item where the different modes adopted a different approach. Three different paper-based versions of the questionnaire were created: one for the web survey, one for the telephone survey and one for the face-to-face survey. All three versions were cross-checked to maintain consistency. This allowed virtually all variables to be constructed in the same format to allow for combined data analysis. 4.7 Fieldwork at Wave 7. Wave 7 fieldwork ran from 18th May 2010 to 12th October 2010. This wave of the study followed the approach carried out in Waves 5 and 6, where the young person could complete the interview either online, over the telephone or face to face with an interviewer in their own home, as they had done previously. Additionally, in line with the previous two waves, only the sampled young person was interviewed; there were no parental interviews. All web respondents received an advance letter and incentive via the post, and those with an email address (87%) were sent an advance email. Contact via email allowed easier access to the survey because the young people were able to click directly onto the hyperlink included in the email. Where any incomplete web surveys occurred and a telephone number was held for the respondent, a reminder call was given during the later stages of fieldwork. Any partial web cases that had still not completed a survey after the end of t
121. y section, but it was reinstated at Wave 4 to collect information from boost sample respondents. At Waves 5, 6 and 7 only the young person was interviewed, with them answering questions about their parents/guardians and their household. Table 2 provides a description of the general content within each questionnaire section at all seven waves. Appendix A provides details of any problems highlighted within the questionnaire after fieldwork. Footnote: Questions were asked about parents/guardians in Wave 5 only.
Table 2: Summary of questionnaire content, Waves 1 to 7 (the per-wave coverage marks in the Wave 1 to Wave 7 columns are not recoverable from this extract)
Household section (answered by YP at Waves 5, 6 and 7): Household situation; Household Grid; Languages spoken in the home
Main parent section (Waves 1, 2, 3 and 4 only): Attitudes to the young person's school and involvement in education; Extra-curricular classes; Year 10 subject choices; School history; Special educational needs; Parental expectations and aspirations; Family activities; Household responsibilities; Household resources; Young person history; Future contact details; Relationship with young person and contact with services; Reasons for not living with natural parents; Risk factors (absences, truancy, police contact); Year 11 experiences; [one further item illegible in this extract]
Individual parent section (Waves 1, 2, 3 and 4 only): Demographics
122. y the variable refers to; for example, W1 refers to Wave 1 information and W4 refers to Wave 4 information, with the exception of the variable NW3SPINT, described in detail in Section 10.2.13. Missing values within the Index file follow the same definitions as those described in Box 2 in Section 5.11. For ease of reference, the cross-sectional variables are described below and are referred to as Wx within this user guide where applicable. Details of how the derived variables have been constructed are provided in Appendix D. 10.2.1 surveyID. This is a unique household-level identifier that can be used to merge this data and other deposited LSYPE datasets to each other. 10.2.2 HHID. This variable has been created as a fixed person number across all waves. This was necessary as some household members were found to have swapped positions within the Household Grid between waves, although the majority of household members, including the main and second parent respondents, remain in the same position across all waves. Households that had swapped positions have been amended on the cross-sectional files and all will correspond with this variable. 10.2.3 Age. At Wave 1 each household member was asked their age and this variable records their answers. This variable is fixed based on the Wave 1 data but has been updated where possible with information collected at subsequent waves if details were missing at Wave 1. Dates of birth we
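A merge on surveyID might look like the sketch below. It is illustrative only: the rows are invented, the wave file is assumed to hold one row per household, and `W1exampleYP` and `merge_on_survey_id` are hypothetical names. Because surveyID identifies a household and not a person, the Index file rows are first restricted to the young person (HHID = 1) so that each household contributes a single row to the merge.

```python
# Invented stand-in for the Index file: one row per household member.
index_rows = [
    {"surveyID": 1001, "HHID": 1, "Age": 13},
    {"surveyID": 1001, "HHID": 2, "Age": 41},
    {"surveyID": 1002, "HHID": 1, "Age": 14},
]

# Invented stand-in for a wave file, ASSUMED to hold one row per household.
wave1_rows = [
    {"surveyID": 1001, "W1exampleYP": 3},
    {"surveyID": 1003, "W1exampleYP": 2},
]


def merge_on_survey_id(left, right):
    """Left join: keep every row of left, adding matching columns from right."""
    right_by_id = {row["surveyID"]: row for row in right}
    merged = []
    for row in left:
        match = right_by_id.get(row["surveyID"], {})
        # Values from the left-hand row win if a column appears in both.
        merged.append({**match, **row})
    return merged


# Restrict to the young person so that surveyID is unique on the left side.
young_people = [row for row in index_rows if row["HHID"] == 1]
merged = merge_on_survey_id(young_people, wave1_rows)
for row in merged:
    print(row)
```

Households with no matching row in the wave file are kept without the wave variables, which makes non-response at that wave visible in the merged data.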