Annex Five 

The Data Collection and Validation Exercise

The Data Collection Exercise
Data collection for the 2007 Aid Effectiveness Report was preceded by a consultation exercise to confirm the format of the questionnaire and the structure of the database, a dialogue that took place mainly through the Partnership and Harmonisation TWG in the latter part of 2006. Training was then provided to all development partner data focal points during a series of half-day sessions at CDC. These training sessions considered both procedural and technical aspects of the reporting exercise and included both on-line data input and off-line up-dating.

A training manual was developed and training was completed by mid-January 2007, at which time all focal points were asked to provide data by mid-February. This date for submission of data proved to be highly ambitious, however, and CRDB staff continued to provide support until mid-March. At this point a request for final validation was sent to development partners so that data analysis could commence by end-March, although some partners continued to validate their data until mid-April, requiring that the analysis be revised and the Report redrafted. It must be noted that development partner focal points, and their CRDB counterparts, demonstrated significant dedication and commitment to entering and cleaning data and this effort is gratefully acknowledged.

The main lessons of the data collection exercise could be summarised as follows:

  • There is a very limited understanding in most development partner offices of many terms and definitions, in particular related to the Paris Declaration and technical cooperation

Not Such Good Practices?

Partner Data Systems

A number of issues constrain the ability of development partners to disclose information.

These include limited awareness of aid effectiveness principles (considered fundamental to aligning aid to the NSDP), restricted access to project documents (including budgets), and a common need to revert to capitals for basic information.

Decentralization and improved information management will be essential if development partners are to maximize the effectiveness of their support to the NSDP.

  • Where Development Partners need to refer to their capitals or regional offices, data is often very difficult or impossible to obtain, even on matters as routine as project budgets 

  • Efforts by CRDB, in its capacity as the Government aid coordination focal point, to obtain project documents so that some independent verification could be undertaken proved to be very difficult, suggesting that development partners themselves do not have ready access to this information

  • Many development partners reported significant levels of 'database fatigue', having been asked to present the same data to different parts of Government in recent months

  • Problems experienced by development partners in reporting on their work cast some doubt on their ability to meet their Paris Declaration and H-A-R commitments on reporting on how their funds are used

Emerging Good Practices

Access to Global Fund data

The 2007 data collection exercise obtained information from the Global Fund to Fight AIDS, Tuberculosis and Malaria for the first time.

This important provider of assistance maintains an excellent website that provides immediate access to financial data and complete information on all projects.

This allowed CRDB staff to record and verify data efficiently. The Global Fund is the only partner whose data exactly matches that which is reported to the DAC.

  • Data on trust funds and regional programmes are difficult to obtain and therefore most likely undercounted. As regional initiatives increase, systems must be developed to record and monitor these flows

  • A streamlined process of data collection and validation will not only make the process more robust and efficient but could also have significant benefits in linking the management of aid flows to the Budget/MTEF and PIP exercises

  • In the future, projects should be routinely entered into the Database as they are developed (pipeline) and approved (on-going), precluding the need for an intensive annual exercise

  • More routine and pro-active support from CRDB counterparts, perhaps on a quarterly basis, will improve the quality of the data and promote the 'managing for results' capacity of both Government and its development partners.

  • Despite the laudable efforts of all focal points, the data collection exercise provides a useful reminder to all parties that development cooperation is often a 'messy business' and it is not always possible to characterise or capture the complex nature of support in a database.

One important conclusion related to the data collection exercise is that very few partners appear to have information systems in place that permit ready access to information on the projects that they finance. To promote more effective aid management in the context of the NSDP, it would be useful to work in partnership to identify the features of such a system and to make use of Cambodia's participation in the OECD/DAC Working Party on Aid Effectiveness to ensure the cooperation of donor capitals or regional offices, whose support is often required in reporting on routine project activity.

A second conclusion is that much more work needs to be done by both Government and development partners if the Paris Declaration and the whole aid effectiveness agenda are to be applied. The limited awareness of these commitments in many development partner offices goes some way to explaining the paradox of many development partners being vocally committed to the H-A-R Action Plan at a senior level while the reality of the practices employed in their programmes and projects is perhaps somewhat different.

The final conclusion relates to the management of data and information systems across Government. In the context of on-going reforms and associated sector/thematic work, it will be important to simplify and harmonise the collection and sharing of data. Multiple data collection exercises are not only inefficient but they can also lead to conflicting sets of data being used for programming or reporting purposes. Harmonising both data collection processes and the calendar for collecting information is one potential option to be explored, including for the PIP and Budget exercises.

The Questionnaire

It is necessary to consider in more detail some of the problems experienced in completing the questionnaire so that these issues can be addressed in future rounds, either by revising the questionnaire or by providing more training. Some parts of the questionnaire were found to be particularly prone to error and misunderstanding, or else the data was simply not available.

This applies in particular to the use of technical cooperation, i.e. the distinction between Free-Standing Technical Cooperation and Investment-related Technical Cooperation, as well as the recording of the use of project staff. Misunderstanding was also common in the recording of government implementers and in the association of a project with a Program-based Approach (PBA), with some partners believing that if their project was part of a sector that had established a PBA then their support was automatically associated with it. A further example relates to the recording of Paris Declaration indicators, which was discussed in the previous Section.

Further complications arose because some development partner focal points do not have complete information, especially for NGO-implemented projects, and the questionnaires need to be sent to the project/program implementer. This implies that more time, or a more routine data collection exercise, is required, while the Manual must be updated to elaborate on the Glossary of Terms and to provide clearer guidance to the user.

Overall, this experience raises important questions about the Government's ability to exercise full ownership of development assistance when there are such prevalent misconceptions, or a lack of routine data systems for providing information on the provision of strategic resources such as technical cooperation.

Is it ODA?

The CDC Database attempts to present a full picture of all external flows. This includes those flows considered to be Official Development Assistance but also other flows that are intended for the non-official sector (which are technically not defined as ODA) or which are sourced from the non-official sector.

The Database therefore captures a wider range of external flows than just ODA. As explained below, this is one of the reasons why the data in the system is in some cases different from that collected by the OECD/DAC and recorded in their Creditor Reporting System (CRS).

How good is our data?

Before policy measures can be prescribed, it is necessary to consider the quality of the data that has been used in the preparation of this Report.

The starting point is to take a macro view that compares data collected in Cambodia to that of the most reputable global source for ODA data, i.e. the OECD/DAC Creditor Reporting System (CRS). Although data for 2006 is not yet available in the CRS, figures from 2005 can be compared to provide a useful insight into the quality of the data.

The charts below show disbursements by Cambodia's main multilateral and bilateral development partners, recorded by both the OECD/DAC and the CDC Database. Key points to note are:

  1. CDC records data not appearing in the DAC database. The DAC data does not record disbursements made by some of the most important providers (in financial terms) of development assistance. This includes the Asian Development Bank (reporting disbursements of USD 89.4m in 2005 to CDC), the World Bank (USD 37.8m), China (USD 46.6m), the Republic of Korea (USD 14.9m) and most UN agencies (only UNICEF and UNAIDS appear in the DAC CRS data).

  2. Where CDC has over-recorded support this may be because a development partner (including NGOs) has provided funds for a non-ODA activity that would not appear in the DAC CRS.

  3. DAC records data not appearing in the CDC Database. Some development partners have reported to the DAC but have been unable to record their support in the CDC Database (e.g. Austria, Ireland, Luxembourg, Norway, Spain), while others (USA, Australia, France, Switzerland, Netherlands) have not been able to report fully, often because significant shares of their support are not disbursed through the local representative office. CDC will continue to work with these partners to support them in entering their data.

  4. Where CDC has possibly under-recorded the figure this may be because disbursements have been made to regional programmes that benefit Cambodia or because disbursements are made for assistance that benefits Cambodia but which is not available for directly funding activities in the country (e.g. scholarship schemes, administration/staff costs).

  5. Other development partners (e.g. OPEC Fund) are known to provide support through other development partners but their original source of funds is not recorded in either Database.

  6. The omission of several development partners from the OECD/DAC dataset implies that the degree of competition, which is presented in the analysis in Chapter Two, is significantly understated and that the scale of the 'coordination challenge' is actually greater than that implied simply by the data that is used to inform the fragmentation analysis.

  7. Only a very few development partners (e.g. the Global Fund, EC, UK, Sweden, Japan and Belgium) are able to demonstrate a consistency in their reporting to both DAC and CDC systems.

Discrepancies in 2005 ODA Disbursement Reporting (CDC data compared with DAC CRS figure)

Chart panels: Absolute differences (USD million); Relative differences (per cent)

Source: CDC and DAC CRS Databases (CDC data is April 2007 and may not be final numbers submitted by partners)
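The two measures shown in the chart can be illustrated with a minimal sketch. It assumes that the difference is taken as the CDC figure minus the DAC CRS figure and that the relative difference is expressed as a percentage of the DAC figure; neither convention is confirmed by the chart itself. The function name `discrepancy` and the two "Partner" rows are hypothetical and included only for illustration; the ADB figure is the one quoted in this Annex.

```python
from typing import Optional, Tuple


def discrepancy(cdc: float, dac: Optional[float]) -> Tuple[Optional[float], Optional[float]]:
    """Return (absolute difference, relative difference in per cent).

    Assumed convention: CDC minus DAC, relative to the DAC figure.
    If the partner has no DAC record, no comparison is possible.
    """
    if dac is None:
        return None, None
    absolute = cdc - dac
    # A relative difference is undefined when the DAC figure is zero.
    relative = None if dac == 0 else 100.0 * absolute / dac
    return absolute, relative


# USD million, 2005. Only the ADB figure comes from this Annex;
# the other two rows are HYPOTHETICAL illustrations.
records = {
    "Asian Development Bank": (89.4, None),   # reported to CDC, not in DAC CRS
    "Partner X (hypothetical)": (20.0, 25.0),
    "Partner Y (hypothetical)": (12.0, 10.0),
}

for name, (cdc, dac) in records.items():
    absolute, relative = discrepancy(cdc, dac)
    if absolute is None:
        print(f"{name}: no DAC CRS record, no comparison possible")
    elif relative is None:
        print(f"{name}: {absolute:+.1f} USD m (relative difference undefined)")
    else:
        print(f"{name}: {absolute:+.1f} USD m ({relative:+.1f}%)")
```

A positive value would indicate that CDC records more than the DAC, a negative value the reverse.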

While these numbers compare development partners' aggregate disbursements, it should also be noted that total 2005 disbursements to Cambodia recorded by CDC (USD 610 million) are significantly higher than the figure recorded by the DAC (USD 392.3m), even once the DAC non-reporting donors are accounted for, as NGO disbursements from their own sources are recorded in the Database; in 2005 these were estimated to be USD 44.7m.

The comparison of aggregate datasets leads to the conclusion that the CDC database captures significantly more funding than the CRS system and, although this is not without its problems, it means that it is likely that the CDC dataset presents a more complete, and therefore more accurate, picture regarding the availability of external support.
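The arithmetic behind this aggregate comparison can be laid out as a short sketch using only figures quoted in this Annex. The variable names are illustrative, and the list of non-reporting partners is limited to the four for which a figure is given (UN agencies are therefore left out), so the result is an indication rather than a reconciliation.

```python
# All figures in USD million for 2005, as quoted in this Annex.
cdc_total = 610.0
dac_total = 392.3

# Partners reporting to CDC but absent from the DAC CRS data
# (only those with a figure quoted in the text; UN agencies omitted).
non_dac_reporters = {
    "Asian Development Bank": 89.4,
    "World Bank": 37.8,
    "China": 46.6,
    "Republic of Korea": 14.9,
}
ngo_own_funds = 44.7  # estimated NGO disbursements from their own sources

gap = round(cdc_total - dac_total, 1)
named_non_reporters = round(sum(non_dac_reporters.values()), 1)
residual = round(gap - named_non_reporters, 1)

print(f"CDC total minus DAC total:        {gap}")
print(f"Named DAC non-reporters:          {named_non_reporters}")
print(f"Gap left after named partners:    {residual}")
print(f"NGO own-source estimate:          {ngo_own_funds}")
```

The figures do not reconcile exactly, which is to be expected given the over-recording and under-recording issues listed in the key points above.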

The next step in considering aggregate data quality is to consider data consistency over a longer period of time. To do this, those 15 development partners who report to both CDC and the DAC can be extracted from the data set and analysed separately over an extended time period.

It can be seen in the chart below that the aggregate disbursement trends move quite closely together, although the within-year discrepancy between the CDC and DAC figures can be as large as USD 50 million. There does not appear to be any systematic relationship in the deviation, however, as in 2002 the CDC Database recorded approximately 20% higher disbursements than the DAC, but this was reversed in 2005. Analysing development partners on a like-for-like basis (comparing the discrepancies for a single partner), the correlation coefficients between their annual disbursements are remarkably high (this is shown in the table to the right of the chart, below). Movements and trends in the respective CDC and DAC datasets on individual partners in individual years are therefore broadly similar, which indicates that the data is of a relatively robust nature even if there are some aggregate discrepancies.
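For readers unfamiliar with the measure, the like-for-like comparison can be sketched as a Pearson correlation between one partner's annual disbursements as recorded in each database. This is an assumption about the type of coefficient used, which the Report does not specify. The function `pearson` is written out for illustration, and the two series are entirely HYPOTHETICAL; they are not drawn from either database.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL annual disbursements for a single partner, 2002-05 (USD million).
cdc_series = [30.0, 34.0, 41.0, 38.0]
dac_series = [25.0, 31.0, 39.0, 42.0]

r = pearson(cdc_series, dac_series)
print(f"Correlation between CDC and DAC series: {r:.2f}")
```

In this made-up series the yearly figures differ in every year, yet the coefficient is high because the two series rise and fall together. This is the sense in which trends can be broadly similar even where levels disagree.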

CDC and DAC Deviations in Recording Aggregate Disbursements (2002-05)

Chart series: DAC CRS Data; CDC Data. Accompanying table: Individual Donor Correlation Coefficients.

N.B. A correlation coefficient of 1 implies an exact positive linear relationship between two variables; 0 implies no linear relationship.

Source: CDC and DAC CRS Databases (showing aggregate disbursement data for Australia, Belgium, Canada, Denmark, EC, France, Germany, GFATM, Japan, Netherlands, New Zealand, Switzerland, Sweden, United Kingdom, United States)

The final consideration with regard to data quality is to move away from an aggregate comparison of disbursements to look more closely at the data reported by each individual development partner. One of the main problems that manifests itself at an aggregate level concerns the number of partners who do not appear to know which sector their support is directed to (USD 42 million, or 6.9% of all disbursements, is categorised as 'other' sector despite multiple sector selections being permitted); another concerns the correct identification of the implementing partner (either Government or NGO).

The data is robust for policy analysis

Given that: (i) the CDC dataset contains more information from more development partners; (ii) across time there is a close correlation between development partner disbursement data in the DAC and CDC datasets; and (iii) the data in the CDC Database, especially the non-financial data, has been cleaned relatively thoroughly, it is possible to conclude that the data used for this Aid Effectiveness Report is sufficiently robust to inform policy analysis. As the Government and its partners progress on the path toward 'managing for development results' and more evidence-based aid management, however, additional attention should be paid to improving the quality and coverage of the data.

Measures to improve data collection and quality

There are a number of relatively straightforward practices that can be either strengthened or established to improve the quality of the data and the analysis. It must be emphasised that these practices would not be intended as an end in themselves; they would be directly associated with the effort to improve aid management at an aggregate level with commensurate benefits to NSDP implementation. These potential practices include:

  1. Moving from an annual data collection exercise to a less intensive quarterly exercise with development partner focal points. This would include training, data entry and validation, and analysis;

  2. Development partners should work more closely with CDC as the aid coordination focal point, consistent with CDC's mandate under the Sub-Decree that established it. At the project formulation/approval stage this would enable increased alignment and coherence, while data entry could take place simultaneously so that the PIP and MTEF exercises could be strengthened;

  3. All project documents and agreements should be lodged with CDC to allow for independent validation and a reduced burden on the development partners.

Future Data Collection Exercises
Making data collection an integrated part of the development assistance formulation/agreement process not only links it more closely to the management of the NSDP but also makes it much simpler. This should be the objective for future data collection procedures.

Adopting an 'enter as you go' approach will also reduce the intensity of the annual exercise to report on disbursements, allowing for that period to be used for training, awareness raising and a more strategic dialogue on the manner in which the data can inform the 'managing for results' effort. Efficiencies in data collection might also be pursued within Government as data collection exercises can be combined and more use made of data sharing, for example in producing the Sector Profiles that are presented in Chapter Two.

