First Monday

Regional and demographic differences in United States Internet usage

Abstract
An analysis of regional and demographic differences in United States Internet usage by Alan R. Peslak

The Pew Internet and American Life Project collects data on overall Internet usage in the United States. This study reviews data gathered by Pew in December 2002 and tests the overall premise that regional differences exist in Internet usage in the U.S. today. Through Chi–square analysis this report tests whether observed regional and demographic differences in Internet usage are statistically significant. The report first reviews regional and demographic differences separately and finds significant variation across twelve defined regions and ten separate demographic categories. It then reviews demographic differences within regions and tests a series of null hypotheses proposing no significant differences between regions based on the demographic factors. Most of these hypotheses are rejected, with the exceptions noted. The report explores other limited hypotheses on the data and concludes with a call for further study to refine the impact of regional and demographic differences in Internet usage in U.S. society.

Contents

Introduction
Review of the literature
Pew Internet Project
Background and methodology
Results
Summary and conclusion

 


 

++++++++++

Introduction

Over the past ten years the use of the Internet and the World Wide Web has grown explosively. From 1992 to 2002, the number of Internet users grew from practically nothing to over six hundred million worldwide (Jupiter Media Corporation, 2003). Variations in usage of the Internet among countries, regions and demographic groups seem significant. The United States itself has a diverse population, and the growth and usage of the Internet have not been the same across all regions of the country. There appear to be differences in the usage of the Internet both regionally and demographically. But are the differences between regions based on these demographics? The answer to this question is important for electronic commerce marketing as well as for understanding whether a "digital divide" exists within regions and demographic categories in the United States. This study is an attempt to review differences that have been noted and determine the statistical significance of these differences.

This study makes use of data collected on behalf of the Pew Internet and American Life Project. The Project periodically performs random telephone surveys and collects information on Internet usage in the United States. These data are made freely available to researchers to perform analyses of trends and findings. The Project also periodically prepares and releases reports on its data. This report is a direct follow–up to a Pew report issued in August 2003, which reviewed past surveys, compared usage across defined regions of the country, and compared the demographics of Internet users in these regions with those of the country as a whole. That report suggests many differences among the regions of the country (Spooner et al., 2003). This report reviews and extends this analysis in several ways.

First, this report reviews strictly current information. The published report examines demographic data from 2001 coupled with regional data from 2000, 2001, and 2002. This report’s analysis deals exclusively with December 2002 data.

Along with the currency of the December 2002 data, another factor which is relevant is consistency. The data all come from one data set, the December 2002 tracking data set, and thus match both regionally and demographically. There is no mixing of data sets.

The next difference relates to the statistical nature of this report. The Pew report only reviews each region and depicts differences in demographic trends versus the national average in each region (Spooner et al., 2003). This report uses Chi–square analysis to determine whether the differences between the defined regions of the U.S. are statistically significant.

The Pew report deals with only five key demographic factors and reviews their regional differences. These factors are age, sex, income, race, and educational level (Spooner et al., 2003). This report adds five additional factors for analysis: Employment status, marital status, community type (urban, suburban, or rural), student status, and parent status. Thus, this report provides a more complete understanding of regional demographic differences.

Finally, this study begins to review some of the key assumptions in the Pew report and analysis, including its definition of regions, and tests the United States census regions as an alternative. It also reviews some nested categories of demographics to determine whether some of the regional differences found can be further explained through analysis. As an example, the Chi–square test is performed on region, race, and income to determine whether exhibited regional differences in race can be explained by income levels within race.

 

++++++++++

Review of the literature

A limited amount of scholarly work has been performed on Internet usage in the United States. One of the first significant articles that attempted to measure Internet use in the United States was written by Thomas Miller in July of 1996. This article began by chronicling the rise of the Internet up to this point including the growth of adult users of the Web from 2.2 million in 1994 to 6.6 million in 1995. Total users over 18 represented 8.4 million users of whom 4.5 million were between the ages of 30 and 49. The study discussed some of the major areas for demographic analysis of Internet usage that have been used. It specifically focused on age, but it discussed the need to measure based on income and education level as well. Overall the study found limited usage of the Internet for students under 18 at that point in time, with only two percent of the population under 18 using the Internet. It proposed as a general concept that young Internet users tended to use the Internet as a communications device whereas older (30 and up) tended to use the Internet as a device for information retrieval. The author also suggested that the older group used the Internet more for finding information than the younger group (Miller, 1996).

Emmanouikides and Hammond (2000) studied Internet usage over the period 1995 through 1997 via surveys conducted by NOP Research Group. Their study examined usage, frequency, locations, applications, and demographics and focused on continuity of use. They found that current users in this time frame were still early adopters or pioneers. They found that Internet use at home or work provided more continuity than use through public access. Finally, they noted that communications were the most popular use, but information seeking and services were the best predictors of recent use. They also found that Internet usage is linear in that the longer someone has been a user, the more likely that person is to be a heavier user.

Trocchia and Swindler (2000) studied Internet usage among older Americans via a series of in–depth interviews. They found that many factors hinder greater Internet usage among older Americans. These factors range from resistance to change to physical dexterity. The reasons primarily center on social and physical issues which could be overcome. The authors suggest that older Americans are underrepresented in Internet usage due to these factors, but if these factors can be overcome there may be an excellent marketing opportunity. Older Americans have more free time and disposable income and those who do use the Internet use it for significant amounts of time. Seniors overall, however, are still one half as likely to use the Internet as the general population. The factors holding back greater usage, both social and physical, should be addressed.

A telephone survey study by Katz and Aspden (1997) examined the issues surrounding the use of the Internet. At the time of this survey only eight percent of the population considered themselves Internet users. The telephone survey examined some of the common demographics of Internet users at the time. It found that 55 percent of recent users were male (down from 66 percent of long–term users), suggesting a trend to more female usage. It found a bias toward youth in terms of Internet use, with recent and long–term users both being younger than average in the population. Education played a large role in determining usage, with 76 percent of long–term users being college educated. Overall users were considerably and consistently better educated than the general population. Long–time users of the Internet were considerably better off financially than non–users, with 58 percent of users having a household income greater than US$50,000 compared to around 25 percent of the general population. But the authors saw a broadening trend here as well, with a smaller percentage of recent users in the wealthier category. The study did not find any association of Internet use based on number of children, but it did find significant under–representation based on non–White ethnicity. The study found the top two reasons for using the Internet were communications and finding information. Though there were many users, the respondents still found obstacles to greater usage. The top obstacles were difficulty, ease of use, and cost.

Another early study of Internet use was reported in Communications of the ACM in 1996. Kraut began an analysis of Internet usage and proposed that research be undertaken to determine who uses the Internet and how it is being integrated into society (Kraut, 1996).

Cummings and Kraut (2002) first studied Pew Internet survey data from the time periods of 1995, 1998, and 2000 and attempted to analyze how computer use has changed over the past several years. The authors proposed as a result of the study that Internet and computer usage had become more "domesticated" over this time frame, with higher usage at home versus work and use for personal activities versus work activities. Their preliminary statistical analyses suggested that there were positive trends away from work use to home and personal use, though the authors suggested their conclusions were not definitive.

Thompson (1998) first studied the effect of occupation on Internet usage in Singapore. He used an Internet based questionnaire and classified occupation into three general categories — student, IT employee, and non–IT employee. The study found that for communications, IT personnel use the Internet more frequently than students or non–IT personnel. Daily usage time for messaging however was significantly higher for students and IT personnel than non–IT personnel. For browsing or information searching activities, all groups had similar daily usage time for Internet activity.

Thompson followed up this 1998 study in 2000 with a report on demographic and motivational variables associated with Internet use in Singapore. Again a Web page survey was used. He studied demographic factors and their relationship to four Internet activities, messaging (communications), browsing (information retrieval), downloading, and purchasing. The demographic factors used were sex, age, and education. He found that age is negatively related to messaging, found no relationship between education and any of the four activities, and found a positive relationship between gender and messaging, with females more likely to use the Internet for messaging than males.

Park and Jun (2003) performed a study of Internet usage comparing Korean usage with average American usage. The study found that the period of Internet usage was longer for the U.S. than for Korea. The total hours spent per week on the Internet were higher in Korea, at a mean of 18.15 hours versus 11.30 for the U.S. This demonstrated a distinct cultural difference in usage of the Internet.

 

++++++++++

Pew Internet Project

All data used in this analysis come from the December 2002 tracking survey coordinated by the Pew Internet Project. The Pew Internet and American Life Project is the source of the data, and the Project bears no responsibility for the interpretations presented or conclusions reached based on analysis of the data.

The Pew Internet and American Life Project is a non–profit organization that periodically surveys a cross section of the United States to gather data on the social impact of the Internet. The December 2002 survey was conducted between 25 November and 22 December 2002 and represents some of the most comprehensive and current data on Internet use in the U.S. today. The telephone survey of more than 2,400 adults 18 and older was conducted by Princeton Survey Research Associates for the Pew Internet and American Life Project.

The Pew Internet and American Life Project has itself prepared a report on prior data related to the demographics of Internet use and the regional differences in Internet use. On 27 August 2003 the Pew Project issued a report entitled "Internet Use by Region in the United States" in which it first defined 12 separate regions of the United States and then reported on Internet usage in each of the twelve regions (Spooner et al., 2003). Based on older data from 2000, 2001, and 2002, it presented tables for each region that showed what percentage of the population in each region used the Internet by five primary demographic categories: Sex, race, age, income, and education. The tables showed regional usage for each factor and compared these percentages with averages for the United States. This work is an extension of that study.

 

++++++++++

Background and methodology

This report extends the Pew Internet Project work in several ways.

  1. It provides a standard statistical process, Chi–square analysis, to determine whether differences observed in regional Internet usage are statistically significant.
  2. It adds five demographic factors, namely employment status, marital status, community type (urban, suburban, or rural), student status, and parental status, to the five demographic factors in the Pew report for a more comprehensive regional demographic analysis.
  3. It uses exclusively current data.
  4. The study tests census region results versus the Pew defined regions to attempt to determine potential regional designation issues.
  5. Finally, it begins to test multiple levels of demographics to start to see if there are sub–demographic regional differences.

As a short additional exercise, an age demographic study was performed to determine whether just e–mail usage produced different results for regional variation.

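The twelve regions identified by the Pew study (Spooner et al., 2003) and carried forward in this report are New England, Mid–Atlantic, National Capital, Southeast, South, Industrial Midwest, Upper Midwest, Lower Midwest, Border States, Mountain States, Pacific Northwest, and California. The states in each region are listed in Table 2.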

California was designated separately due to its unique nature relative to bordering states.

As noted, the Pew report suggested identifiable differences among the regions relative to the national average. The methodology used in this study was to perform Chi–square analysis to determine whether the differences between regions were statistically significant. Pearson’s Chi–square analysis tests the proposition that there is no association between the row variable and the column variable in tabular data. If the p value is less than .05, the null hypothesis that the row variable is unrelated to the column variable is rejected. In this study the null hypothesis is generally that there is no significant difference between the regions for each factor tested, so rejecting it means that there is a statistically significant difference between regions. SPSS 10.0 was used to analyze the 2,463–record dataset. Based on answers to the following question, "Do you ever go online to access the Internet or World Wide Web or to send and receive e–mail?" the null hypotheses tested were:

Hypothesis 1. There is no significant difference in Internet usage between regions of the United States based on age.

Hypothesis 2. There is no significant difference in Internet usage between regions of the United States based on race.

Hypothesis 3. There is no significant difference in Internet usage between regions of the United States based on educational level achieved.

Hypothesis 4. There is no significant difference in Internet usage between regions of the United States based on income level.

Hypothesis 5. There is no significant difference in Internet usage between regions of the United States based on sex.

Hypothesis 6. There is no significant difference in Internet usage between regions of the United States based on employment status.

Hypothesis 7. There is no significant difference in Internet usage between regions of the United States based on community type (urban, suburban, or rural).

Hypothesis 8. There is no significant difference in Internet usage between regions of the United States based on student status.

Hypothesis 9. There is no significant difference in Internet usage between regions of the United States based on parental status.

Hypothesis 10. There is no significant difference in Internet usage between regions of the United States based on marital status.

Hypothesis 11. There is no significant difference in Internet usage between census regions of the United States.

Hypothesis 12. There is no significant difference in Internet usage between regions of the United States based on age within race.

Hypothesis 13. There is no significant difference in Internet usage between regions of the United States based on income within race.

Hypothesis 14. There is no significant difference in e–mail usage between regions of the United States based on age.
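
The Pearson Chi–square test behind these hypotheses can be sketched in a few lines. The following is a minimal illustration only, not the SPSS 10.0 procedure used in this study: the function names are hypothetical, and the counts are invented for three unnamed regions rather than taken from the Pew data. Each row of the table is a region and the two columns are Internet users and non–users.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0              # closed form for df = 2
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5   # closed form for df = 1
    # Step df up by 2 at a time: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1))
        a += 1.0
    return min(q, 1.0)


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, p value) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, chi2_sf(stat, df)


# Invented counts (NOT Pew data): rows are regions, columns are [users, non-users]
hypothetical_table = [[72, 28], [53, 47], [46, 54]]
stat, df, p = pearson_chi_square(hypothetical_table)
reject_null = p < 0.05  # same .05 threshold as used in this study
```

With these invented counts the p value falls below .05, so the null hypothesis of no difference between the three regions would be rejected.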

 

++++++++++

Results

The first step in the analysis was to perform Chi–square analysis on each of the demographic factors themselves. This was done to determine whether there was a statistically significant difference within the demographic factors throughout the United States. The results are shown in Table 1.

 

Table 1: Statistically significant demographic factors in the United States.

Factor              Significance level  Example of category  Percentage using Internet in category
Age                 .000                18–29                71.9
                                        65+                  19.7
Race                .000                White                58.9
                                        Black                45.2
Educational level   .000                None or 1–8          6.7
                                        Post–graduate        83.7
Income level        .000                <US$10,000           33.2
                                        >US$100,000          87.3
Sex                 .043                Male                 59.3
                                        Female               55.3
Employment status   .000                Full–time            66.0
                                        Unemployed           56.5
Community type      .000                Urban                62.2
                                        Rural                46.0
Student             .000                Full–time student    86.2
                                        Not a student        52.7
Parent              .000                Parent               65.2
                                        Not a parent         52.8
Marital status      .000                Married              61.0
                                        Divorced             55.2

 

All ten factors exhibited statistically significant differences across their categories. The table shows examples of categories and their variation in the U.S. as a whole.

Next, an analysis was performed for Internet usage across just the regions to determine whether overall there was a difference in Internet usage based on the Pew study defined regions. The Chi–square analysis was performed using SPSS 10.0 and the null hypothesis that there was no difference in Internet usage between Pew defined regions was rejected. The significance was .000. The summary of variation by region is shown in Table 2.

 

Table 2: Regional summary of Internet usage.

U.S. region Percentage using Internet
New England (Connecticut, Maine, Massachusetts, New Hampshire, Vermont, Rhode Island) 71.3
Mid–Atlantic (Delaware, New Jersey, New York, Pennsylvania) 53.3
National Capital (Maryland, Virginia, Washington, D.C.) 64.7
Southeast (Florida, Georgia, North Carolina, South Carolina) 45.8
South (Alabama, Arkansas, Kentucky, Louisiana, Mississippi, Tennessee, West Virginia) 51.1
Industrial Midwest (Illinois, Indiana, Michigan, Ohio) 53.2
Upper Midwest (Minnesota, North Dakota, South Dakota, Wisconsin) 59.0
Lower Midwest (Iowa, Kansas, Missouri, Nebraska, Oklahoma) 56.4
Border States (Arizona, New Mexico, Texas) 61.0
Mountain States (Colorado, Idaho, Montana, Nevada, Utah, Wyoming) 63.7
Pacific Northwest (Oregon, Washington) 72.2
California 64.6
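
For readers who wish to reproduce the regional grouping, the region definitions in Table 2 can be written as a simple lookup structure. This is a sketch only: the names PEW_REGIONS, STATE_TO_REGION, and region_of are hypothetical, and the state lists are copied from Table 2, which does not include Alaska or Hawaii.

```python
# Region membership copied from Table 2; identifier names are illustrative.
PEW_REGIONS = {
    "New England": ["Connecticut", "Maine", "Massachusetts", "New Hampshire",
                    "Vermont", "Rhode Island"],
    "Mid-Atlantic": ["Delaware", "New Jersey", "New York", "Pennsylvania"],
    "National Capital": ["Maryland", "Virginia", "Washington, D.C."],
    "Southeast": ["Florida", "Georgia", "North Carolina", "South Carolina"],
    "South": ["Alabama", "Arkansas", "Kentucky", "Louisiana", "Mississippi",
              "Tennessee", "West Virginia"],
    "Industrial Midwest": ["Illinois", "Indiana", "Michigan", "Ohio"],
    "Upper Midwest": ["Minnesota", "North Dakota", "South Dakota", "Wisconsin"],
    "Lower Midwest": ["Iowa", "Kansas", "Missouri", "Nebraska", "Oklahoma"],
    "Border States": ["Arizona", "New Mexico", "Texas"],
    "Mountain States": ["Colorado", "Idaho", "Montana", "Nevada", "Utah",
                        "Wyoming"],
    "Pacific Northwest": ["Oregon", "Washington"],
    "California": ["California"],
}

# Invert the mapping so each state points to its region
STATE_TO_REGION = {
    state: region
    for region, states in PEW_REGIONS.items()
    for state in states
}


def region_of(state):
    """Return the Pew region for a state name, or None if it is not in Table 2."""
    return STATE_TO_REGION.get(state)
```

For example, region_of("Oregon") returns "Pacific Northwest", while a state absent from Table 2 returns None.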

 

The next step was to perform the demographic analysis per region and determine via Chi–square analysis whether the null hypothesis was supported or rejected for each of the demographic factor variables within each region. The following are the results of the statistical analyses to test each hypothesis.

Hypothesis 1. There is no significant difference in Internet usage between regions of the United States based on age — Rejected.

All age categories (18–29, 30–49, 50–64, and 65+) showed significant differences between the 12 Pew defined regions. This means that different age categories exhibit different Internet usage depending on their home region of the United States. Detailed regional analyses for the first ten hypotheses are available via e–mail from the author. As an example, even though the over 65 age group showed the lowest percentage overall, its regional usage ranged from 11.4 percent in the South to 31.6 percent in New England. This type of demographic information can be particularly useful to Internet marketers.

Hypothesis 2. There is no significant difference in Internet usage between regions of the United States based on race — Rejected for most.

Nearly all categories of race (White, Black, Asian, and Mixed Race) showed significant variance between regions at the .05 significance level. The only exception was Native American, which did not show significant difference between regions.

Hypothesis 3. There is no significant difference in Internet usage between regions of the United States based on educational level achieved — Rejected for most.

Again nearly all categories showed significant variance between regions, including None or grades 1–8, Incomplete High School, High School Graduate, Some College, Business/Technical, and College Graduate. The exception was the Post–graduate category, which did not show significant variance between regions.

Hypothesis 4. There is no significant difference in Internet usage between regions of the United States based on income level — Rejected for most.

The null hypothesis was rejected for all income levels except the upper income level of US$100,000 or more, which did not show significant regional variation.

Hypothesis 5. There is no significant difference in Internet usage between regions of the United States based on sex — Rejected.

Both males and females exhibited statistically significant variation between regions at the .000 level.

Hypothesis 6. There is no significant difference in Internet usage between regions of the United States based on employment status — Partly rejected, partly not rejected.

This hypothesis showed split results based on employment status. Part–time employees and disabled employees did not show regional variation, so the null hypothesis was not rejected for them. The null hypothesis was rejected for full–time employees, retired employees, and the unemployed, all of whom showed regional variation.

Hypothesis 7. There is no significant difference in Internet usage between regions of the United States based on community type (urban, suburban, rural) — Rejected.

The community types of urban, rural and suburban all showed different Internet usage rates based on the twelve Pew defined regions.

Hypothesis 8. There is no significant difference in Internet usage between regions of the United States based on student status — Rejected.

Likewise, full–time, part–time, and non–students all showed regional variation at the .05 significance level.

Hypothesis 9. There is no significant difference in Internet usage between regions of the United States based on parental status — Rejected.

Internet usage based on parental status was not independent of the region of the country.

Hypothesis 10. There is no significant difference in Internet usage between regions of the United States based on marital status — Rejected for most.

Most categories of marital status showed regional variation including married, living as married, divorced, widowed, and never been married. The only exception was "separated" which did not show significant regional variation in Internet usage.

Hypothesis 11. There is no significant difference in Internet usage between census regions of the United States based on income — Not rejected for most.

This hypothesis was a test of the regional definitions used by the Pew Internet Project. An analysis of the four census regions (Northeast, Midwest, South, and West) was performed to determine regional variation based on income. The Chi–square analysis failed to reject the null hypothesis of no regional variation in Internet usage for the income categories of less than US$10,000, US$20–30,000, US$40–50,000, and US$50–75,000, but rejected it for the categories US$10–20,000, US$30–40,000, and more than US$100,000. These alternating results are unusual and suggest that the more refined Pew classifications are to be preferred.

Hypothesis 12. There is no significant difference in Internet usage between regions of the United States based on age within race — Rejected for most.

This hypothesis tested whether there were significant sub–variations when the analysis was drilled down one level further. The Chi–square analysis tested whether, within race (which overall had shown regional variations), there were regional variations based on age. For most categories, this proved to be true. The exceptions, which showed no regional variation, were Whites between the ages of 30–49, Asians between the ages of 30–49, and Blacks over 65.

Hypothesis 13. There is no significant difference in Internet usage between regions of the United States based on income within race — Mixed results.

For the following categories the null hypothesis of no regional variation was not rejected at the .05 level: Mixed race US$10–20,000, Blacks US$20–30,000, Asians US$20–30,000, Blacks US$30–40,000, Whites US$40–50,000, Mixed race US$50–75,000, and Whites US$100,000 or more. All other categories of race/income showed significant regional variation. Interestingly, among those with income of US$100,000 or more, only Whites showed no regional variation; the other races did show it.

Hypothesis 14. There is no significant difference in e–mail usage between regions of the United States based on age.

Finally, this hypothesis isolated e–mail usage from all other Internet uses. The null hypothesis was rejected: for every age group, e–mail usage showed statistically significant regional variation.

 

++++++++++

Summary and conclusion

This study generally supports the limited research done on demographic analysis of Internet usage, with some exceptions. Trocchia and Swindler (2000) found older Americans underrepresented in Internet usage. This finding was supported in this study, which also found regional differences over and above age differences. Katz and Aspden (1997) found a bias toward male users but a growth among female users. There still remain differences due to gender, and these differences carry over into regional analysis. The overall findings of the Cummings and Kraut (2002) review of the Pew Internet survey were supported and statistically tested.

As noted, Miller (1996) proposed as a general concept that young Internet users tended to use the Internet as a communications device whereas older users (30 and up) tended to use the Internet as a device for information retrieval. This significant demographic difference was tested seven years later in this study with the Pew data and was not supported. Miller also suggested that the older group used the Internet more for finding information than the younger group. This finding was also tested in this study, and older individuals (defined as 30–49 versus 18–29) still showed a statistically significant difference in usage of the Internet to find information about services and products. The younger group used the Internet for this purpose only about 25 percent of the time whereas the older group used it about 50 percent of the time. This is in sharp contrast to the analysis for e–mail, which showed approximately equal usage of about 50 percent of the time for both groups.

Overall, the results of these analyses support the idea that in the United States there is very strong regional variation in Internet usage. Nearly all of the null hypotheses of no regional variation were rejected. The exceptions suggest a few areas where there is commonality across regions. These common demographic groups, which do not vary based on region of the country, are shown in Table 3.

 

Table 3: Factors and categories which do not show U.S. regional variation.

Factor                                                        Category
Race                                                          Native Americans
Education                                                     Post–graduate education
Income (but further analysis showed restriction to Whites)    Income over US$100,000
Employment status                                             Part–time employees
                                                              Disabled employees
Marital status                                                Separated

 

Other regional breakdowns (census) showed unusual results, validating the Pew regional breakdown. A limited attempt was made to further refine the sub–demographics of the data. Also an attempt was made to subcategorize Internet usage. Limited insights were found here and further study is suggested. Overall, it has been statistically established that significant regional and demographic differences exist in Internet usage.

 

About the Author

Alan R. Peslak is Assistant Professor, Information Sciences and Technology, at Pennsylvania State University in Dunmore, Pa.
E–mail: arp14@psu.edu

 

References

J. Cummings and R. Kraut, 2002. "Domesticating computers and the Internet," Information Society, volume 18, pp. 221–231.

C. Emmanouikides and K. Hammond, 2000. "Internet usage: Predictors of active users and frequency of use," Journal of Interactive Marketing, volume 14, number 2, pp. 17–32.

Jupiter Media Corporation, 2003. "How many online?," at http://www.nua.ie/surveys/how_many_online/, accessed 2 December 2002.

J. Katz and P. Aspden, 1997. "Motivations for and barriers to Internet usage: Results of a national opinion survey," Internet Research, volume 7, number 3, pp. 170–188.

R. Kraut, 1996. "The Internet @ Home," Communications of the ACM, volume 39, number 12, pp. 32–35.

T. Miller, 1996. "Segmenting the Internet," American Demographics, volume 18, number 7, pp. 48–52.

C. Park and J. Jun, 2003. "A cross–cultural comparison of Internet buying behavior," International Marketing Review, volume 20, issue 5, pp. 534–553.

T. Spooner, P. Meredith, and L. Rainie, 2003. "Internet use by region in the United States," Pew Internet & American Life Project, at http://www.pewinternet.org, accessed 28 November 2003.

S. Thompson, 2000. "Demographic and motivation variables associated with Internet usage activities," Internet Research, volume 11, number 2, pp. 125–137.

S. Thompson, 1998. "Differential effects of occupation on Internet usage," Internet Research, volume 8, number 2, pp. 156–165.

P. Trocchia and J. Swindler, 2000. "A phenomenological investigation of Internet usage among older individuals," Journal of Consumer Marketing, volume 17, issue 7, pp. 605–616.


Editorial history

Paper received 21 January 2004; accepted 12 February 2004.



Copyright ©2004, First Monday

Copyright ©2004, Alan R. Peslak

An analysis of regional and demographic differences in United States Internet usage by Alan R. Peslak
First Monday, volume 9, number 3 (March 2004),
URL: http://firstmonday.org/issues/issue9_3/peslak/index.html