8/2/2019 Vision and Plan for Data Processing Center at Social Science Research Institutes
Vision of Data Processing Environment at a Social Research Institute
Background
Social research institutes play a vital role in coordination between government and
the public, being part of the formulation and monitoring of developmental plans. They also help in
understanding the nature and status of the society within their area of study. Because of this
specific role, they hold data that differ in nature from those of industry and
the government statistical system. Industry places more emphasis on managerial
data (for decisions on day-to-day activity), which are therefore mostly of an online nature, whereas
social research institutes use data for policy formulation and monitoring purposes (and
hence of an offline nature). Unlike government statistical organizations, they often rely
on qualitative surveys and experiment with new methodologies, indicators and
subjective data. Unlike industry and the government statistical system, they
depend heavily on data from many sources, which are not always aligned in
time scale, geographical coverage and purpose. Because of these specific data requirements,
social research organizations require a particular type of data processing environment.
The availability of vast computational power in Information Technology (IT) over the last
two decades or so has, in turn, significantly affected the techniques for designing and
implementing social research (qualitative and quantitative). Parallel to the
developments in hardware, there have been significant improvements in the quality and user
friendliness of software for statistical data processing, analysis, and dissemination. This
has also made it possible for many processing tasks to move from computer
experts to subject matter specialists. A number of software packages for the processing
of statistical surveys have emerged over the years. The relative strengths of
these software products differ across the steps of data processing. Suitable
software for each step of data processing, together with training, has a significant
role in the plan for realizing the vision of a modern data processing system.
Vision
The vision of a data processing environment for a social research institute may be expressed
through the following capacities and behaviors:
1. The institute is capable of large-scale qualitative and quantitative data analysis.
2. Any data related to qualitative or quantitative research may be released for
analysis within four months of field work.
3. Data processing may help in monitoring field work (problems of probing etc.)
through patterns in incoming data.
4. It has sufficient computational and analytical skill to adopt the full strength of
computer-based analysis (see Annexure-1).
5. It can easily adopt any new methodological change in data capturing, analysis,
presentation and dissemination, as well as computerized content and knowledge
management systems.
6. It has a rich data bank comprising all relevant data and documents, either owned
by the institute or collected from others (may be panel data). The data bank is integrated
with a broader network with various levels of access for users of data.
7. It has good links with other institutes and individual users of its studies for
sharing data and ideas through social networks.
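
Point 3 can be illustrated with a minimal sketch. It assumes that each incoming record carries an interviewer identifier and that an answer of None marks an item recorded as "don't know" or left unprobed. The field names, the sample records and the 30% threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def nonresponse_rates(records):
    """Return the share of unanswered items for each interviewer."""
    asked = defaultdict(int)
    unanswered = defaultdict(int)
    for rec in records:
        for answer in rec["answers"].values():
            asked[rec["interviewer"]] += 1
            if answer is None:  # assumption: None marks an unprobed item
                unanswered[rec["interviewer"]] += 1
    return {who: unanswered[who] / asked[who] for who in asked}

def flag_interviewers(records, threshold=0.3):
    """List interviewers whose non-response rate exceeds the threshold."""
    rates = nonresponse_rates(records)
    return sorted(who for who, rate in rates.items() if rate > threshold)

# Hypothetical incoming records from two interviewers.
records = [
    {"interviewer": "A", "answers": {"q1": 3, "q2": 1, "q3": 2}},
    {"interviewer": "A", "answers": {"q1": 2, "q2": None, "q3": 4}},
    {"interviewer": "B", "answers": {"q1": None, "q2": None, "q3": 1}},
    {"interviewer": "B", "answers": {"q1": 5, "q2": None, "q3": None}},
]

print(flag_interviewers(records))
```

A report of this kind, run on each batch of incoming data, would let the field supervisor follow up while the field work is still in progress.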
Organization of Data Processing
To proceed toward the above vision, a modular approach will provide more
adaptability and flexibility in implementing the plan. The total data
processing environment may be divided into centers, each performing different steps of
data processing. These centers are defined according to the nature of the work,
the software required (and its training) and the skills needed to perform the task. Developing all
centers simultaneously to perfection is not essential. They can be developed in
phases.
Although the Data Bank is the central part of data processing, the data processing
system can be developed from the periphery. Centers may be given priority as follows:
(1) Data preparation center:
Although data may come in different forms (text, numbers, audio, video etc.), we
can concentrate at the initial stage on numeric (quantitative) and textual data obtained from
quantitative and qualitative surveys. Because data preparation for quantitative and
qualitative surveys is entirely different (and hence requires different skills and
software), a separate wing may be created for each. The requirements of the wings
are as follows:
[Figure: centers of the data processing environment: Data Preparation Center, Center for Analysis, Dissemination Center, Center of Social Network, and Data Bank]
Quantitative wing:
Hardware: PCs (of moderate strength). The number may vary with the work load.
Software: CSPro (free to download).
Responsibility: Data entry, data validation, codification, basic predefined tabulation,
generation of field monitoring reports.
Skill: The in-charge of the center should have an understanding of (1) the logic associated with the
questionnaire, (2) the steps of data preparation, (3) program development in CSPro and (4)
the basics of databases, spreadsheets and data archiving. The rest of
the staff will work as data entry operators, for whom basic knowledge of computers (the file system) will
be required.
Link: Questionnaire preparation team, data bank
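
The data validation responsibility can be sketched independently of any particular package. The example below is plain Python, not CSPro logic, and the questionnaire fields (age, ever_married, age_at_marriage) and the rules applied to them are hypothetical. It shows the two kinds of check that the questionnaire logic usually implies: a range check and a skip-pattern check.

```python
def validate(record):
    """Return a list of error messages for one questionnaire record."""
    errors = []

    # Range check on a numeric item.
    age = record.get("age")
    if age is None or not 0 <= age <= 110:
        errors.append("age out of range")

    # Allowed-category check on a coded item.
    married = record.get("ever_married")
    if married not in ("yes", "no"):
        errors.append("ever_married must be yes or no")

    # Skip pattern: age_at_marriage is asked only of ever-married respondents.
    age_at_marriage = record.get("age_at_marriage")
    if married == "no" and age_at_marriage is not None:
        errors.append("age_at_marriage should be skipped")
    if married == "yes":
        if age_at_marriage is None:
            errors.append("age_at_marriage missing")
        elif age is not None and age_at_marriage > age:
            # Consistency between two items.
            errors.append("age_at_marriage exceeds age")
    return errors

print(validate({"age": 25, "ever_married": "no", "age_at_marriage": 20}))
```

Returning a list of messages rather than stopping at the first error lets a data entry operator correct a whole record in one pass.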
Qualitative wing:
Hardware: PCs (of moderate strength). The number may vary with the work load.
Software: ATLAS.ti, ANTHROPAC, AnSWR (free), EZ-Text (free)
Responsibility: Entry of field reports (or their summaries) in the format required by the
software, and creation of codes.
Skill: Understanding of the subject and the ability to create suitable quotations and codes from text.
All faculty and research scholars involved in qualitative research should be
skilled in running such software.
Link: Team of qualitative research, data bank
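
The idea of attaching codes to quotations can be illustrated without reference to the packages listed above. This sketch is plain Python and does not reproduce the data model of any of them. The quotations, sources and code names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical coded quotations; texts, sources and code names are invented.
quotations = [
    {"source": "interview_01", "text": "The health post is too far to walk.",
     "codes": ["access", "distance"]},
    {"source": "interview_02", "text": "We cannot pay for the medicine.",
     "codes": ["access", "cost"]},
    {"source": "interview_03", "text": "The bus fare is more than a day's wage.",
     "codes": ["cost", "distance"]},
]

def index_by_code(items):
    """Map each code to the list of sources in which it was applied."""
    index = defaultdict(list)
    for item in items:
        for code in item["codes"]:
            index[code].append(item["source"])
    return dict(index)

def code_frequencies(items):
    """Count how many quotations carry each code."""
    return {code: len(sources) for code, sources in index_by_code(items).items()}

print(code_frequencies(quotations))
```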
(2) Center for analysis
All faculty and research scholars should be attached to the analysis center.
Hardware: PCs with sufficient RAM and CPU strength for all faculty. A good lab for
research scholars.
Responsibility: Exploratory and confirmatory data analysis, report writing,
preparation of presentations.
Skill: Knowledge of word processors, spreadsheets, slide preparation tools,
statistical software, GIS-based modeling and simulation.
Software: MS Office, Open Office (free), Epi Info (free) for presentation through maps
(other open source GIS software may be selected according to the level of requirement; see
http://opensourcegis.org/ and http://freegis.org/ for other sources), Stata (more suitable for analysis of large complex surveys).
Link: Data bank
(3) Dissemination center
Hardware: PCs of sufficient strength.
Software: Basic knowledge of HTML and CSS, and an HTML editor. Many tools are
available that reduce the programming load for the user. Drupal is one of them and is
freely available. Many free HTML editors are also available. Most content
management tools have their own HTML editor.
Responsibility: The center will receive raw documents as soft copy from the
faculty and will convert them into a suitable format for publishing (in hard copy as well as
on the web). Until the data bank is developed, all parts of content management, namely creation,
editing, publishing and managing (archiving), will be the responsibility of this center.
Skill: Aesthetic sense in word processing, skill in using content management tools.
Link: Analysis center, data bank
(4) Center of social network
No social research institute can work in isolation. Recent developments in IT and
the web have made it possible to use social networks for learning and research. There are
many benefits of social networking at the individual level as well as the organizational level.
The benefits at the organizational level are as follows:
1. Make sure knowledge gets to people who can act on it in time.
2. Connect people and organizations to build relationships across boundaries of
geography or discipline.
3. Provide an ongoing context for knowledge exchange that can be far more
effective than memoranda.
4. Attune everyone in the institute to each other's needs: more people will know
who knows who knows what, and will know it faster.
5. Multiply intellectual capital by the power of social capital, reducing social
friction and encouraging social cohesion.
6. Create an ongoing, shared social space for people who are geographically
dispersed.
7. Amplify innovation: when groups get turned on by what they can do online,
they go beyond problem-solving and start inventing together.
8. Create a community memory for group deliberation and brainstorming that
stimulates the capture of ideas and facilitates finding information when it is
needed.
9. Improve the way individuals think collectively, moving from
knowledge-sharing to collective knowing.
10. Turn training into a continuous process, not divorced from normal business
processes.
Hardware: PC with sufficient bandwidth.
Software: Most social software is available as web services and is free.
Responsibility: The in-charge of the center will analyze, expand and maintain the social network of
the institute.
Skills: Although all faculty and staff will be members of this center, the in-charge
will maintain communication on behalf of the institute on social network platforms.
Link: Faculty and staff, all centers, external people and organizations.
(5) Data bank
The data bank is the central part of the data processing system. It is the center through which the other
centers will be coordinated. Apart from holding the institute's own data and reports, the center will work in
a consortium with different academic and research institutes as well as external socioeconomic data banks
such as the Inter-university Consortium for Political and Social Research, the United Nations Statistics
Division, the Minnesota Population Center, the IQSS Dataverse Network etc.
Hardware: Server and PCs with sufficient bandwidth. The institute can hire web hosting
services for maintaining its external link.
Software: Tools for the webmaster (to be selected by the webmaster according to his or her
confidence; many open source tools are available).
Skill: The role of the data center is very challenging. Its in-charge should be capable of
configuring a server, installing applications at the host site and integrating web services. He or she should
know server-side and client-side scripting languages (such as PHP and JavaScript) and database
management tools.
Responsibility: The responsibilities of the data bank center are as follows:
1. Create a catalog of data and reports.
2. Apply uniform codes for geographical areas (in different data sets) so that they may
be linked.
3. Create different aggregation levels of data as needed.
4. Provide data in the required format.
5. Create metadata for data collected by the institute. This will help in sharing data.
6. Prepare time series microeconomic data banks.
7. Perform the role of webmaster.
Links: With all centers and external network.
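
Responsibilities 2 and 3 can be sketched as follows. The example assumes two hypothetical data sets that spell village names differently, and a hand-built lookup table that maps each spelling to a uniform area code whose first two digits identify the district. All names, codes and figures are invented for illustration.

```python
# Hypothetical lookup: every known spelling maps to one uniform area code.
# Assumption: the first two digits of the code identify the district.
AREA_CODES = {
    "rampur": "0101", "ram pur": "0101",
    "sitapur": "0102",
    "madhopur": "0201", "madho pur": "0201",
}

def area_code(name):
    """Normalize a village name and return its uniform area code."""
    return AREA_CODES[name.strip().lower()]

census = [
    {"village": "Rampur", "population": 1200},
    {"village": "Sitapur", "population": 800},
    {"village": "Madhopur", "population": 500},
]
survey = [
    {"village": "RAM PUR", "literate": 600},
    {"village": "sitapur ", "literate": 480},
    {"village": "Madho Pur", "literate": 200},
]

def link(census_rows, survey_rows):
    """Join the two data sets on the uniform area code."""
    linked = {}
    for row in census_rows:
        linked[area_code(row["village"])] = {"population": row["population"]}
    for row in survey_rows:
        linked[area_code(row["village"])]["literate"] = row["literate"]
    return linked

def aggregate_by_district(linked):
    """Sum village figures up to district level."""
    totals = {}
    for code, values in linked.items():
        entry = totals.setdefault(code[:2], {"population": 0, "literate": 0})
        entry["population"] += values["population"]
        entry["literate"] += values["literate"]
    return totals

print(aggregate_by_district(link(census, survey)))
```

Once every data set carries the same area code, the same join works for any pair of data sets, and other aggregation levels only need a different prefix of the code.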
Challenges in realizing the vision
1. It is difficult to identify a role model. A lot of experimentation is going on at the
international level. The institute needs to choose its own path cautiously, learning
from ongoing experimentation.
2. IT people are trained according to the needs of business and industry. It may be difficult to
identify suitable people (or trainers) for the needs of the institute.
3. There may be resistance to changes in the roles of faculty and staff.
4. Old habits may resist change. The chance of resistance increases because
the gains (from the data processing system) can be perceived only after a certain level
of perfection.
5. Training is crucial to the vision. For training to succeed, it is necessary to fix
targets of achievement at the organizational and individual level (in terms of work)
after each training. This is difficult to implement. Trainers also may not be
ready for it (it will require many follow-ups).
6. The hierarchy may object to assigning a higher role to an efficient person.
7. Participants may have weak motivation for training.
Conclusion
From the above discussion, it is clear that developing a good data processing system requires, apart
from investment in hardware, little monetary investment in software.
The real issue in developing a good data processing environment is training. Training for
most areas is also available on the net (even free). Sufficient will and motivation can
lead a social research institute toward a modern data processing
environment.
Annexure-1
Role of Computational Skill in Statistical Analysis
Hurdles in statistical analysis
1. Vague vision regarding statistics: whether it is numbers, a methodology or a way
of thinking;
2. Less importance given to variation than to the center of the data. The main cause
seems to be lack of computational capability. For this reason, statistical scales
could not be developed properly;
3. Simulation as a tool of analysis could not get the importance it deserves, again due to
lack of computational skill;
4. Statistical weights based on data were not used for converting an unknown
phenomenon to a number (use of latent variables), which creates unresolved
disputes;
5. Lack of proper sampling design restricts generalizing results in the right manner;
6. Statistical results are generally interpreted as causal relationships.
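
Hurdle 2 can be made concrete with a small sketch using Python's standard statistics module. The two series are invented: they share the same mean, so a report based only on the center of the data would treat them as identical, while the standard deviation shows how different they are.

```python
import statistics

# Hypothetical scores from two villages; both series have the same mean.
village_a = [48, 49, 50, 51, 52]
village_b = [10, 30, 50, 70, 90]

mean_a = statistics.mean(village_a)
mean_b = statistics.mean(village_b)

# Sample standard deviation as a measure of variation.
spread_a = statistics.stdev(village_a)
spread_b = statistics.stdev(village_b)

print(mean_a, mean_b)
print(round(spread_a, 2), round(spread_b, 2))
```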
Common view on computer based computational capability
1. Required as it works fast;
2. It is useful as it hides mathematical complexity of statistical tools;
3. Obtained results are more accurate;
4. Little computational burden;
5. The only investment is a computer; some feel that a statistical package and the skill
to run it are also required.
What is reality
1. It works fast only if data is organized in proper format;
2. It hides mathematical complexity but it requires clear understanding of
assumptions and interpretation lying behind statistical tools. Application of
tools without a feel for the data may lead to misleading results;
3. It may provide inaccurate, sometimes disastrous, results if proper steps are
not followed;
4. Yes, it eases the burden of computation, if logical complexities are few and
the dataset is large;
5. Apart from investment in a computer and the skill to run statistical software, the skill to
organize data is required.
In fact, most analysts did not change their orientation toward data analysis in spite of fast
improvement in computational capabilities. How the new framework of analysis should
differ from the old one, the capabilities required and the new concepts emerging from the
availability of powerful computational tools can be understood by comparing the old
framework of analysis with the new one (as follows):
Old framework of analysis | New framework of analysis
Start analytical work by following precedence in the area of study | Start analysis with an attempt to know and feel the data (exploratory data analysis)
Format of analysis is fixed before planning of data collection | Mixed strategy is followed, with more emphasis on learning from data
Computational skill and analysis and interpretation are treated as different entities | Needs computational and analytical skill in the same person
Descriptive analysis is based only on different measures of central tendency such as mean, median, mode etc. | Apart from studying central tendency of data, more emphasis is given to variation in data
Testing of assumptions for use of certain statistical tools is almost neglected | Testing of assumptions of tools and transforming the data to meet these assumptions is given importance
Anything computed is worth reporting | A major part of computation is meant for understanding and feeling the data
Computational work cannot be reused | Reusability is a significant part of skill
Belief that analysis starts after obtaining the data | Belief that analysis starts with planning of the survey
Missing values and non-response are not given due weight because of computational problems | Missing values and non-response can be handled easily
Sampling design is not important for developing a model | Sampling design is important for applying a model
Only those statistical models should be used which have clear mathematical solutions | Simulation may be used where an analytical solution is not possible
Understanding of behavior of data in terms of probability is not very important | Understanding of probabilistic interpretation of behavior of data is important
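
The row on simulation can be illustrated with a bootstrap, one common simulation technique; the source does not name a specific method, so this choice is an assumption. The sketch estimates the standard error of a median, for which no simple general formula exists, by repeatedly resampling the data with replacement. The income figures, the number of replications and the seed are hypothetical.

```python
import random
import statistics

def bootstrap_se_median(sample, replications=2000, seed=0):
    """Estimate the standard error of the median by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    medians = []
    for _ in range(replications):
        resample = [rng.choice(sample) for _ in sample]
        medians.append(statistics.median(resample))
    # The spread of the resampled medians approximates the standard error.
    return statistics.stdev(medians)

# Hypothetical skewed income data with one very large value.
incomes = [12, 15, 15, 18, 21, 22, 25, 30, 41, 95]

se = bootstrap_se_median(incomes)
print(round(se, 2))
```

Because the function takes any list of numbers, the same computation can be reused on other variables, which is the reusability the new framework asks for.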