Myers-Briggs

“Do you think personality type has some impact on how we learn, use, and think about information technologies?”

LS560 posted the above question in Module 3b. I hadn’t thought about the relationship between IT and personality types before, but just seeing the question opened a window onto a whole field of information.

From the website: http://www.humanmetrics.com/personality/type

The third criterion, Thinking – Feeling, represents how a person processes information. Thinking means that a person makes a decision mainly through logic. Feeling means that, as a rule, he or she makes a decision based on emotion, i.e. based on what they feel they should do.

The fourth criterion, Judging – Perceiving, reflects how a person implements the information he or she has processed. Judging means that a person organizes all of his life events and, as a rule, sticks to his plans. Perceiving means that he or she is inclined to improvise and explore alternative options.

How could these personality traits not factor into the structured yet constantly changing nature of IT? I could identify it in myself as I struggled with CSS. I wanted rules and a plan (can you tell I am a Judging?). I also sometimes just went with ‘what feels like it might work’ (can you tell I am a Feeling?). In my younger years I was more of a Thinking than a Feeling. Are the current political climate and its uncertainty, or other life factors, influencing me? Will I swing back from Feeling to Thinking in the next era of my life?

This question was very useful in my role creating new Excel tools at work and then ‘selling the idea of them’ to the people who have to use them. I approached each person differently, and I can now tie that to their personality. It would be fascinating if we all took the test so I could fine-tune a separate approach for each person that would be the most helpful for the individual involved.


IT Fluency change – week 2.5 LS560

I am appreciating this learning curve. I am beginning to view IT as something I can learn instead of something I want someone else to ‘make work’ for me so that I can produce something.

My IT skills and comfort level have increased since I started. I am learning the web page language of HTML5 and understand some of it. The intimidating mystery of it diminished with the simple act of typing a text document that a browser then converted for me into a page with color and style. I have found my overall confidence in being able to handle what comes at me increasing.

I am not the fastest at acquiring new IT skills, but I have an appreciation for both its structure and the way it is constantly changing. I can understand why there isn’t a textbook with step-by-step instructions for us, because by the time it was published some of it would be obsolete. I understand that SGML, HTML, HTML5, XML, and XHTML are family members of different ages and different levels of development. I understand that these languages become less useful over time as new standards are developed and browsers at first have trouble keeping up, then later have trouble reaching back.

I am spending more time exploring my own computer and browsers and searching for related information. I feel a greater desire to learn and explore. I am not ahead on any of the work; it is all I can do to keep to the schedule, especially when I go down a rabbit hole of ‘how would you do that?’ But I like it and it is going OK.

At my job, I had to create an Excel spreadsheet to track some basic information for our branch libraries. The users are extremely varied (everything from OK with IT to scared to death of it in any form).

I gave a lot of thought to the end users and ran the spreadsheet by different users at stages of its development. By researching, I was able to find and add features such as protecting sheets while unlocking cells. After my beta test with the users, I also researched enough to add a shared-user feature. All through the process, I kept up a ‘jolly along’ factor with the users to lower their stress level so that they would feel comfortable calling to ask questions. There were some hilarious missteps, but I got to function as a simple phone IT service provider this week, and we all felt great at the end of implementing this new program. From the day I got the project, I loved thinking about how to make it a success and how to manage the change it would represent for them.
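The ‘protect the sheet, unlock the input cells’ pattern can be sketched as a toy model. This is not Excel’s actual mechanism (which lives under Review > Protect Sheet plus per-cell Format Cells > Protection settings), just a minimal Python illustration of the idea that every cell defaults to locked and only designated input cells accept edits:

```python
# Toy model of "protect the sheet, unlock the input cells":
# every cell is locked by default; edits are only allowed on
# cells explicitly marked as unlocked. Illustrative sketch only.

class ProtectedSheet:
    def __init__(self, unlocked_cells):
        self.cells = {}                      # cell address -> value
        self.unlocked = set(unlocked_cells)  # input cells users may edit

    def edit(self, address, value):
        """Allow the edit only if the cell was left unlocked."""
        if address not in self.unlocked:
            return False                     # protected: edit rejected
        self.cells[address] = value
        return True

sheet = ProtectedSheet(unlocked_cells={"B2", "B3"})
print(sheet.edit("B2", 120))     # input cell: allowed -> True
print(sheet.edit("A1", "oops"))  # label/formula cell: blocked -> False
```

The payoff for nervous users is exactly this behavior: they can type only where typing is safe.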

I like that IT is losing its mystery. It is all out there. I can’t learn it all, but I can learn some of it and then keep on learning.

I don’t know if my parents’ generation thought that all they had to do was ‘learn their job’ and then just do that for the rest of their lives, but that is my impression of how they thought. That isn’t a workable mindset in this day and age. For a human to continue to grow and truly live, it never was.

85-year-old IT use – chat/interview

Interview/chat with an 85-year-old retired male, distilled from an extensive conversation. Observations at the end of this post.

What is the first information technology you remember using?

Answer: Riding in a car

Discussion: A basic overview of information technology was explained. (Note: In the introductory discussion, a clear definition of what is meant by information technology would have been helpful before starting.)

New Answer: When I was working for a juvenile detention facility as a teacher in the 1980s, we had a class on using a computer to keep records. The instructor was unable to make it work and had to leave. When we got the program later, it tended to ‘lose’ information and was not reliable. It was clear from further conversation that early information technology was perceived as cumbersome and unreliable.

IT SKILLS

What information technology do you own and use? TV and cell phone. This led to a discussion about smart phones vs. cell phones. New Answer: Smart TV and smart phone. More discussion revealed he also has a computer (a laptop).

 PHONE SPECIFIC QUESTIONS

What kind of smart phone is it? Apple, at least I think it is, there is a picture of that on it.

Why did you choose this phone? It was part of a plan – this was the upgrade.  Elaboration: It was provided by a family member as part of their plan and given to the interviewee with a case. It was set up by multiple family members, mainly one son in law.

What functions do you perform on your smart phone? Phone calls and looking up phone numbers. I use it as a personal phone book. He then looked at his phone and added these uses: weather report, map, calendar, photo storage, clock, calculator (“it simplifies math”), and compass. Upon prompting, he added that he does text a little.

Upon further questioning: “I get information from Google” an estimated 3 times a day, and he takes pictures “once in a while.” He mostly receives pictures, “moving pictures as well as stills,” and keeps in touch with his family this way. He can sometimes attach pictures to his texts, but not reliably. He can sometimes take pictures, but not reliably, and cannot store photographs sent to him. He uses FaceTime once in a while but is not too comfortable with it and usually needs help.

What do you like most about your smart phone? Convenience, protection/security – “safety tool in case I fall”

What don’t you like about it? “Finding information on Google,” “Information overload on Google” “It’s hard to get a quick answer, you can get miles of answers.” Added: Communication has improved so much it is hard to keep up with it.

SMART TV SPECIFIC QUESTIONS:

Interviewee watches multiple news channels for information and watches a lot of entertainment TV also. Really enjoyed talking about the TV and how far it has come. Really difficult to keep this on IT as a main topic.

Volunteered: “Very happy with that.” “TV has gotten so good and so advanced, hard to imagine it getting any better.” Other than: “they don’t have local sports, which kind of pissed me off, and they have too many other things on there that I am not too interested in; see all the channels they have, it takes me 15 minutes to go through and find all the shows I don’t want to watch.”

What kind of TV is it? Unknown. (This interview was being conducted in front of said TV, which was muted.) Bright lights were turned on and another family member was made to read the label on the TV set; together they came up with the answer ‘Vizio.’ This took some time, and they seemed a little unsure if that was the correct answer.

What does a smart TV do? “Unlimited channels and up to date programs.” “They are timely, meaning you can watch them on your schedule.” Note: Local cable features are seen as part of the smart TV features. Smart TV features are accessed and used by other members of the family on behalf of the interviewee; he is unaware of how they function and unable to use them himself.

COMPUTER SPECIFIC QUESTIONS:

What kind of computer do you have? “I don’t know, I bought it for company, people want to use it, I’ve never turned it on.” The interviewee also has a printer for the same reason and is not able to use that either.

INFORMATION TECHNOLOGY QUESTIONS:

Which new information technology have you found most helpful in your life? Television – “It fills in a lot of blank time for a retired guy.” Questioning revealed that he watches news and Bill Maher and gets his football results on his TV.

What about information technology do you find to be the most annoying? Texting instead of talking to the group around them.

What is the greatest challenge in using technology for you? Becoming expert in its use. (See observations below for additional observed challenges.)

How do you approach that? Take a break and wait for the expert (the previously mentioned son-in-law) to come in and do it for me.

How do you solve IT problems? Give it to the previously mentioned son-in-law, who straightens it out, “just like tonight on the TV set,” by which he means the cable box.

Have you ever used your phone to solve a problem? “Lots of times… to find the costs of things… communication for information… called for help when my hip [went] out…” More complex problem solving isn’t done.

 

When asked his overall impression of information technology as he understands it:

“I like it now better – it’s a hell of an improvement, like in early 50’s, predictions so far beyond that…” “Back then it went into the 3 channels, you had to have a screen in front of TV, 13 settings 3 worked, such a poor picture, now look at this oh my god.” As mentioned before, the interview was taking place in front of the TV with the mute on.

His first TV sat on a box, like a radio with a thing in the middle; it was impossible then to get a picture this good. He feels TV is finally catching up to where it said it would, doubling its information ability so fast it is impossible to keep up, and that a lot of places are so far behind it’s no use for them to have it. Upon questioning he elaborated that this meant remote tribes and our poor areas. The interview concluded here.

OVERALL IMPRESSION A RESULT OF THIS INTERVIEW:

The interviewee has some basic information technology skills. His view of, and interest in, learning IT skills seems partially colored by the complexity and unreliability of his early experiences and partially set by a general lack of interest in how things work. He exhibits a willingness to accept IT’s usefulness without desiring to understand any of the underlying function, which severely restricts his ability to use it. He approaches IT the way one might drive a car without knowing how it works.

He is able to use IT concepts to find some basic information through Siri/Google on his smart phone. NOTE: Having helped with some of the early learning curve on this, I know the initial period of use was complicated by him greeting Siri and introducing himself to her at each encounter. He doesn’t relate to computers as algorithms working independently of humans. My impression was reinforced by overhearing an automated phone call in which the interviewee had an extensive and frustrating conversation with a computer. He can articulate the difference, but is patterned to respond as if it were human if it has a voice.

When asked about how he searched Google for information, he understood that you search by the name or “whatever you are looking for.” He understood some of the basics of keyword searching but not in any detail.

In the realm of intellectual capabilities, he is able to use IT to meet some very basic information needs but not to solve a problem of any complexity. He was able to call for help when his hip went out, so he did solve a problem, but by using IT rather than by applying it.

In interviewing an older person, I would budget a lot more time, and more follow-up time as well. Some of the basic definitions behind the questions were not understood, and I needed to rethink some of the questions in order to ask follow-up questions that could be understood and answered. I think if I were ever to design a large-scale survey, it would be helpful to do trial runs on my chosen age group to refine the questions before beginning, since the survey would need to be consistent in questions and responses. A basic clarification of what is meant by my core terms, in language understandable to the interviewee, would also be helpful, as well as challenging to write.

After this interview, I wonder if one reason digital immigrants are less knowledgeable and enthused about IT is that they experienced a long, formative adult period in which a large investment of time and energy resulted, at best, in a small payoff in IT usefulness (at worst, there was a net loss in efficiency and time savings). Maintaining enthusiasm for learning and growing with IT took a level of vision that would not be logical for the average human. If you go to the same store for ten years and it is painful every time you shop there, you are likely never to want to go there again, and any mention of the store will bring up negative emotions. Older digital immigrants have a lot to get over, and after this interview I feel more positive about the steps they do make. They were promised Buck Rogers and instead given two hours of 1s and 0s that resulted in the outline of a rocket ship lifting off (most of the time, if the computer didn’t crash). No wonder it didn’t set their world on fire; that would have required a grand vision and optimism that most people don’t possess.

An observation, not elicited through the interview but made during it, is that manual coordination, dry fingertips, and slow reaction time make some functions impossible to perform reliably and lead to further frustration and unwillingness to engage. An example is the interviewee’s smart TV source menu, which timed out while he was observed using it. The time it took to read, understand, and react with a physical movement was too long to accomplish a switch from one source to another before the function timed out and the choice was ‘lost.’

I wonder also if a mechanical worldview, with definite ‘ons and offs,’ makes it harder for digital immigrants to gain comfort in a digital world that includes loading delays and processing time. With the already difficult dry fingers and lower coordination, it may be easy to just assume it ‘isn’t working’ and needs to be poked again, and again, and…

Overview of me

I work at a public library, and the reference desk is part of my job. I use what I consider ‘regular’ technology like e-mail, Word, Excel, Photoshop, InDesign, Publisher, etc., as well as various proprietary software programs. I use a variety of search engines. I like online learning and think Khan Academy is a wonderful tool; I have used Lynda.com and others like it as well. I have a Twitter account and use Flipboard; I like the ability to curate my own experience and choose sources. I use Facebook, though mostly for class now. I feel that I can find most information online, but look forward to becoming more discerning.

I am a digital immigrant and like my magazines, news and books online and on paper. I have a laptop, pad, desktop, smartphone, and GPS.

It feels like I have just ‘interviewed myself.’ I look forward to becoming IT Fluent and know I will continue to work on it after this class ends.

My next blog will be the results of my chat with an 85 year old user of technology.


Article Summary for Lecture #11 – Barite

The Notion of “Category”: Its Implications in Subject Analysis and in the Construction and Evaluation of Indexing Languages

The author points out that historically the approach to the notion of category has been concerned with intellectual analysis and a metaphysical approach. From Aristotle to Ranganathan, the approaches have been re-conceptualized theories of the organization of all knowledge by experts who didn’t question the basic notion and validity of approaching the theory of category from that angle. The author questions the applicability of such an abstract notion of category to the organization of knowledge and whether it has been established “plainly and clearly, why and what for categories are useful in our disciplinary field?” The author feels that the underlying theory of category has gone unexamined, much like any theory that is ‘accepted’ for a long enough period of time. He proposes reexamining the notion of category from a functional and instrumental perspective.

He begins with an overview of the theory of category, walking us through views from Aristotle to Ranganathan. He feels these thinkers may have been so close to the subject that they didn’t feel a need to provide a clear explanation of the principles, statements, and inferences of their theory. Along the way he expresses frustration that the notions of category, characteristic, and class are often used in too indistinct a way, blurring the clear differences between them in much of the literature.

He goes on to define categories as simplified abstractions used by classificationists to investigate regularities and represent notions. The point of them is to logically organize systems of concepts for organizing knowledge in general terms and subject analysis in specific terms.

Moving away from Ranganathanian theory he concludes that categories are of interest to our field as instruments of analysis and organization. They are not of interest as metaphysical concepts but as levels of analysis applied to human knowledge and their representative abstractions.

He details category’s usefulness to subject analysis and indexing in three precise activities: the design of indexing systems, the modification of classification tables, and the evaluation and analysis of indexing languages. The ‘notion of category’ facilitates the subject analysis process for indexers in establishing subject precedence and hierarchy. He deconstructs the notion of category into: its sectorial essence; its implied specific level of analysis; its external relationship to its object; its exclusion of other categories; and its generalizable nature.

He discusses the interrelatedness of category, the object and the analyst and their connected function in transferring disciplines and concepts to indexing language.

He discusses how every domain has its own conceptualized structure that determines the choice and application of specific categories and subdivisions. He discusses the subdivisions of categories, known as characteristics, facets, or attributes, which are further analyses of the categories.

He notes that although there have been centuries of theorizing, agreement has not been reached on a limited collection of categories, and points out that the number of categories increases in inverse proportion to the degree of generality aimed at. Continuing his practical view, he feels such decisions are linked to the need or utility of the analysis.

He concludes that his revisionist approach may need more specific application to determine its adequacy and relevance. He also recommends further study of the functional-instrumental perspective, which could contribute more to the understanding of category. He ends by advocating for attention to this topic, as it is essential both to the practice of indexing and to the educational knowledge passed on by lecturers on the subject.

This article made sense to me, but I am not sure if that is because, being new to LIS, I get bogged down in the philosophical mazes of categorizing viewpoints. I see that his approach, if valid, could be taught more easily than other approaches to category theory. I appreciate that he looks at the foundational ‘given’ of cataloging theory and questions it. I would like to see whether the ‘more specific application’ that he proposed for his theory has been done and what the result was. This is an article that makes me excited for our next two classes, as a lot of information is needed before I can discount centuries of learned theory in favor of a utilitarian one, no matter how much I like it. I can’t yet make the jump to the idea of not having a philosophical underpinning to categorizing.

Article Summary for Lecture #10 – Anderson

The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part II: Machine Indexing, and the allocation of human versus machine effort.

This is the second half of a two-part essay detailing the strengths and weaknesses of human and computer indexing. Part two focuses on computer indexing, which the authors define as ‘the analysis of text by means of computer algorithms.’ It concludes with the authors’ opinion of the most cost- and resource-efficient division of labor between human and computer indexing.

The authors argue for the expanded use of computers as an aid to indexers and as the provider of initial indexing for all documents. They feel that the role of human indexers should evolve into a more qualitative function additionally serving as an ‘expert opinion’ or ‘readers guide.’ They propose a system for identifying documents that are more significant and using additional human indexing on them, serving to make them available to a broader group.

They begin with a disclaimer that they will not be addressing user input other than relevance feedback. They mention the history of automatic indexing and two theoretical models of indexing then quickly refocus on their main topic, the automatic indexing of language texts in English.

They start with a discussion of how computers ‘see’ a word and how that is not as straightforward as one might expect. They cite examples of multiple punctuation issues that change the meaning of words and then walk us through writing rules to deal with these issues. They end each discussion with realistic ‘best answers’ rather than unattainable ‘perfect answers.’

They go through the steps of automatic indexing from the simplest to the most complex, including the strengths and weaknesses of each. They cover keyword indexing, negative vocabulary control, word frequency counting and weighting, stemming, and analyzing or parsing grammatical structure to disambiguate or further index. They discuss the new hybrids that use multiple approaches and highlight ‘Keyphind,’ which uses syntactic and statistical analysis along with stemming. They conclude that this approach is excellent but too expensive and time consuming. I can see from my other readings this week that it is now part of sophisticated search engines; this article was written when that approach was in its infancy.
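The simpler steps the authors describe (keyword extraction, negative vocabulary control via a stop list, crude stemming, and frequency weighting) can be compressed into a short sketch. The stop list and suffix rules below are tiny stand-ins of my own, not anything from the article:

```python
# Minimal automatic-indexing pipeline: tokenize, drop stop words
# (negative vocabulary control), apply a crude suffix-stripping
# stemmer, and weight the surviving terms by frequency.

from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are"}

def crude_stem(word):
    """Strip a few common suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def index_terms(text, top_n=5):
    words = re.findall(r"[a-z]+", text.lower())
    stems = [crude_stem(w) for w in words if w not in STOP_WORDS]
    return Counter(stems).most_common(top_n)

text = "Indexing texts: the indexer indexes texts, and indexed texts are searchable."
print(index_terms(text))  # 'index' and 'text' surface as the top stems
```

Each stage here corresponds to one of the steps the authors evaluate, and the weaknesses they note (ambiguity, over-stemming) show up even in this toy version.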

Having opened the door to new approaches, they continue with a discussion of clustering, or identifying groupings of items, and latent semantic indexing (LSI), which identifies highly related terms and the documents related to them. They speak of LSI as an attempt to get at the ideas being expressed, much like human indexers do. They conclude with citation indexing, bibliographic coupling, and co-citation.
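The intuition behind clustering and LSI, that documents whose term profiles point in similar directions are about similar things, can be illustrated with plain term-frequency vectors and cosine similarity. This is a simplification of my own (real LSI also factors the term-document matrix), and the sample texts are invented:

```python
# Term-frequency vectors plus cosine similarity: documents that share
# term usage score closer to 1.0, unrelated documents closer to 0.0.

from collections import Counter
import math

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    terms = set(a) | set(b)
    dot = sum(a.get(t, 0) * b.get(t, 0) for t in terms)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

d1 = vectorize("indexing languages and indexing systems")
d2 = vectorize("indexing systems for libraries")
d3 = vectorize("smart phone camera review")
print(cosine(d1, d2) > cosine(d1, d3))  # related documents score higher
```

Clustering algorithms build on exactly this kind of pairwise similarity to group items.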

The only user analysis element they discuss is relevance feedback through its beginnings with actual queries of a user to the more advanced approaches that ‘read’ user response. They discuss indexing and abstracting services and their use of humans for both complex analysis and to create IR databases.

They see an increased need for human indexing and knowledge organization that can only be met by increasing our use of computer support for more mundane tasks. They make the case that we should use computers as aids to perform initial indexing and for additional functions such as presenting human indexers with prompts, check tags and reminders.

They advocate:

  • Allocate humans and computers to areas where each are most effective and efficient
  • Stop treating all documents as worthy of complete indexing
  • Develop ways to sort documents into ‘computer indexing only’ and ‘computer plus human indexing’ groupings, applying computer indexing to all documents and further human indexing to only the 20% or so that are ‘worth’ it. They offer initial guidelines for determining the importance of a work: use, citation, publisher prediction, reviews and awards, searcher nomination, advisory boards, and indexer nomination. They address ways to limit bias and add balance.
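The proposed sorting into ‘computer indexing only’ and ‘computer plus human indexing’ groups amounts to a triage function over importance signals. A sketch, where the signal names and weights are my own illustrative assumptions rather than values from the article:

```python
# Triage sketch: score each document on importance signals, then route
# roughly the top 20% to additional human indexing. Signal names and
# weights are hypothetical illustrations of the authors' guidelines.

def importance_score(doc):
    signals = {
        "use_count": 0.3,            # how often the document is used
        "citations": 0.3,
        "awards_or_reviews": 0.2,
        "searcher_nominations": 0.2,
    }
    return sum(weight * doc.get(name, 0) for name, weight in signals.items())

def triage(docs, human_share=0.2):
    """Return (human+computer group, computer-only group)."""
    ranked = sorted(docs, key=importance_score, reverse=True)
    cutoff = max(1, round(len(ranked) * human_share))
    return ranked[:cutoff], ranked[cutoff:]
```

A nomination or advisory-board flag would simply be another weighted signal feeding the same score.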

They feel that indexers should evolve to function as helmsmen by adding qualitative judgement to their indexing. They cite ‘Bradford’s Law,’ which explains the inundation by irrelevant material that occurs in intensive searches, and use it to illustrate a family of ‘issues created by computer indexing and searching’ that humans should focus on.

This article is a great grounding step in human versus computer analysis, though it was written much earlier in the automatic indexing timeline than where we are now. They detail most of the steps in automatic indexing used today, though the algorithms are now sixteen years more advanced. It left me feeling, given how far we have come already and that Google has been using AI since 2015, that automatic indexing is a great tool. Automatic indexing needs humans to constantly update algorithms, check results, correct weak areas, look for bad actors, and guide development, but with the trillions of documents on the web we must be the masters and not the drones.

They came to their conclusions before computer indexing was as useful or cost effective as it is now. Their support for a blended approach, and for the use of humans on complex information documents, is still useful today. This article was written in 2001, and our reading this week included a 2000-to-2016 summary of major changes in Google algorithms across that time. Those readings, plus watching an actual Google indexing meeting, made me realize that humans are doing much of what the authors advocate, such as constantly refining and working on better computer analysis. I don’t see that human indexers have moved into the clear qualitative advisory role that the authors espouse; however, there are choices made by humans on what to index, so it somewhat fulfills what the authors advocate.

Perhaps the biggest obstacle I see is that adding more computer help didn’t free up a wealth of indexers to become this next generation of helmsmen. Not only did the information volume explode, but, reading Bates’ article After the dot-bomb, perhaps there were many areas where human indexing and knowledge experts could have been used but either weren’t invited or didn’t make a strong enough case for their own inclusion. Are we in a period of regrouping where the value of indexers and classifiers is acknowledged and sought out, or has the computer field advanced so far that the language each group speaks is too dissimilar for them to perceive that we work toward the same end and have much knowledge to contribute?

LS501’s readings this week address the need for data curation, digital media’s instability, and libraries being among the only institutions with the capacity to accomplish this curation. The readings also speak to the increasing need for authors to publish compendia with their papers to ensure the scientific standard of reproducibility, and note that many scholars do not have the required skill sets to address these processes and requirements. If we don’t answer these needs, society will find another provider. The call to partner seems clear from my readings; I look forward to learning more about our profession’s answer to this call. Even my reading of UA’s MLIS catalog is now done with more open eyes.

Article Summary for Lecture #9 – Anderson

Nature of Indexing: How humans and machines analyze messages and texts for retrieval. Part 1: Research, and the nature of human indexing.

This essay was co-written in 2000 by authors from two different domains: one a human cataloger/indexer/teacher and one a computer scientist. This is the first part of a two-part essay in which they delve into the differences between human and computer analysis and the best allocation of each in indexing. They review opinions and research by experts on both sides of the divide. Part one delves specifically into human analysis: they explore its nature and the inadequacy of research about it. They conclude with an overview of writings on the value of formulating rules for human indexing and the difficulties involved in doing so.

They sketch the background of both parts of the essay by explaining that ‘user retrieval need’ requires an intermediate representation (IR) database. An IR requires that messages and texts be analyzed in order to be usable. Setting the stage for both parts of the essay, they briefly outline that humans perform this analysis by considering what message the document or text represents, including features of the document or text, while computers perform it by comparing the symbols or components of the texts, applying outside lexicons, thesauri, or other contextual data to characterize ‘sets,’ and possibly using pattern algorithms for large units of text. We have learned that human indexers use domain-specific guidelines and controlled vocabularies and are often learned in the specific field they index. Here, and in the later discussion of the ‘cognitive approach to indexing,’ the fact that humans also apply outside lexicons, thesauri, and other contextual data appears to be overlooked, which makes those areas questionable.

They state that the two main components of any retrieval system are the user and the IR system, but add that there are other key variables, such as the size of documentary units, the extent of indexable matter, exhaustivity, specificity, and other aspects so numerous that they are problematic to isolate and analyze. They feel that research on human versus computer analysis has not been effective because of ‘associated variables not being isolated or controlled.’ By its nature, computer analysis lends itself more easily to measurability than human analysis does.

The essay focuses specifically on the analysis of messages or texts. Their footnote defines this as “‘Message’ is used for the ideas, thoughts, emotions, or knowledge that a creator intends to convey to other people. ‘Text’ is used in the semiotic sense for the organized set of symbols chosen to represent the message.” This essay specifically addresses language text.

They note that “about everyone” agrees that there are two basic steps in human indexing: the first is an analysis that results in the creation of a “notion,” and the second is converting that “notion” into the indexing language or format required by the IR database (or the design of the index). They address in detail the lack of rules for the first step and the larger set of rules that exist for the second.

Using multiple experts’ writings, they launch into an investigation of the prevailing view of indexing as a cognitive process ‘that cannot be reduced to a set of rules,’ in opposition to Frohmann’s view that the cognitive process of indexing is overemphasized. They continue with support for the idea that indexing needs a better set of constructed rules. They do mention BSI and ISO rules, but focus on experts who feel these are too vague and not domain-specific enough. They also mention Taylor’s naming of Wilson’s four ‘types’ of analysis in indexing (purposive, figure-ground, objective, and unity or selection/rejection), but again find them lacking.

They point out that the lack of expressed rules for the first step, 'creating a notion,' makes it more difficult to research or evaluate than computer indexing. They proceed to discuss formulating rules for this step, presenting experts' arguments for and against 'rule discovery' versus 'rule construction' to meet the need for more rules. They advocate a combined approach: discovering the cognitive processes our minds use, then viewing them in an environment in which we look for the social context and biases inherent in that view. They note that indexers are humans who analyze in the context of their culture and personal experiences.

The discussion becomes fascinating at this point as it begins to cover user aims and the social purpose and context of the text (both for its creator and for the user who may want to access it). How does the indexer approach a text where a dominant social institution created the text and is paying for its indexing, while a marginalized group may also have an interest in accessing that text using different search criteria? Whose aims, goals, and intentions are being fulfilled in the social world this text is retrieved in?

They feel most rules for indexers do not focus on 'request-oriented indexing' for the user, but that this should be a much larger factor in rules and rule making. They discuss the domain-analytic paradigm, which marries the analysis to the knowledge domain of the user and asks how best to serve their needs.

They seem to sum up with Cooper, as cited by Frohmann, and his approach that the indexer must scan the document while keeping his user population in mind, along with the many uses that might often be neglected, concluding that if a person using a given term is likely to want the document, then the term should be used. They then say that Cooper's ideas never caught on in the indexing community. Maybe this leads directly into Part II, automatic indexing. That remains to be seen next week.

This essay was twenty-three pages long and delved into multiple aspects of indexing. I found myself thinking: aren't they just talking about artificial intelligence versus human intelligence for indexing? Don't we essentially codify 'human rules' when we take computer analysis beyond keyword searching of full text to subject strings, relevance, and applied controlled vocabulary? I appreciated their view that constructing rules would force indexing to investigate the historic, economic, political, and social context of the rules of its domain, much as LCSH has done in classification.

Early on, in discussing human vs. computer indexing, they state that 'users find them on balance, more or less equally effective.' I question that statement. Since this is an essay and not a research paper, they have the right to make it, but I feel it immediately weakened their position with any reader who doesn't already agree. Without clearly documented proof, it makes the authors' opinions suspect to those of us who don't already believe it.

Their statement that 'the fact that IR databases that rely solely on automatic indexing have been economically successful means that the users who are paying for them… …find them sufficiently effective to justify the cost' also felt like a weak argument, having read last week about enterprise databases beginning to hire and use human indexers out of a need for more efficient retrieval, and finding it cost effective. I found numerous issues like this in the essay, but felt that the discussion was thought provoking and illustrated many opinions and approaches. I ultimately learned a lot from their analysis of indexing, their approach to indexing rules and the user, and their broad view. I look forward to finding out more about human rules for analysis and indexing, and to learning whether, as a result of writing rules for computers, we haven't already advanced a lot in this area.

They address IR systems beginning to take more than one indexing approach in hopes of maximizing retrieval, which seems so obvious to me that it speaks to this article being sixteen years old.

Addressing the lack of comparative research at the beginning of the essay broke the flow of my thought process and of the development of their idea; I would have preferred it at the end. However, if I had come to it with a broader background, that might not be true.

Article Summary for Lecture #8 – Gross

Still a Lot to Lose: The Role of Controlled Vocabulary in Keyword Searching

This post summarizes and applies the above paper to LS500 studies, the Organization of Information.

As keyword searching has become comfortably ubiquitous, some in the LIS community feel that free-text searching of bibliographic records should replace controlled vocabulary cataloging of those records. They argue that the cost savings would be great and search effectiveness would be adequate. A study on keyword searching vs. subject heading searching, written in 2005 by Gross and Taylor, refuted the idea of keyword searching's adequacy and clearly demonstrated the value of a controlled vocabulary (specifically subject headings) in keyword searches.

This 2014 paper supports the conclusions of the 2005 study by Gross and Taylor and expands upon it, adding a literature review and exploring many of the criticisms leveled at the 2005 study. In 2005 the internet was expanding and its capabilities were still being explored. The times were partially responsible for an overconfidence in the ability of keyword searching to lead to less work and more automation. The economics of information organization is a recurrent theme throughout LIS, and it would have been an easy leap to make in the early 2000s.

The authors begin with a brief review of the 2005 study, which concluded that “if subject headings were to be removed from or no longer included in catalog records, users performing keyword searches would miss more than one third of the hits they currently retrieve.”

The authors repeat a variant of the 2005 study, this time including added TOCs and summaries/abstracts (essentially addressing the criticism that those items would compensate for the lack of subject headings). Even with these additions, they find that keyword search misses 27% of hits, compared to the 35.9% of hits lost in the 2005 study. The addition of TOCs and summaries was only somewhat helpful. This article clearly illustrates that by including a controlled vocabulary we serve Cutter's second goal of collocation and Svenonius's fifth bibliographic objective of navigation.
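The mechanism behind those lost hits can be sketched in a few lines of Python. This is only a toy illustration with hypothetical records and a hypothetical search term (none of it drawn from the actual study data): when a keyword appears only in a human-assigned subject heading and not in the title or other free text, dropping the subject field makes the record invisible to a keyword search.

```python
# Toy illustration (hypothetical records, not from the Gross/Taylor data):
# how a keyword search loses hits when controlled-vocabulary subject
# headings are removed from bibliographic records.
records = [
    {"title": "Storms of the Gulf Coast",            # keyword only in subjects
     "subjects": ["Hurricanes"]},
    {"title": "Hurricane Katrina: A Retrospective",  # keyword also in title
     "subjects": ["Hurricanes", "Disaster relief"]},
    {"title": "Coastal Weather Patterns",            # no match at all
     "subjects": ["Meteorology"]},
]

def keyword_hits(records, term, use_subjects=True):
    """Count records whose title (and optionally subject headings) contain the term."""
    term = term.lower()
    hits = 0
    for rec in records:
        text = rec["title"].lower()
        if use_subjects:
            text += " " + " ".join(rec["subjects"]).lower()
        if term in text:
            hits += 1
    return hits

print(keyword_hits(records, "hurricane", use_subjects=True))   # 2 hits
print(keyword_hits(records, "hurricane", use_subjects=False))  # 1 hit
```

In this tiny made-up catalog, half the hits for "hurricane" depend on the subject headings alone, which is the same effect the study measures at scale as its 27% / 35.9% loss figures.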

They detail two major criticisms that were leveled against the 2005 study. The first was that limiting the search to English may have dramatically underestimated the number of hits that would be lost in keyword searching, which they agree with.

The second criticism they address was that other elements, such as tables of contents (TOCs) and summary notes, could be added that would increase the accuracy of the search enough to largely overcome the lack of subject headings. In their literature review, the authors note that the topic of subject headings vs. keyword searching had been researched extensively for decades, but that in the last twenty years no one had looked at the research with the purpose of determining whether keyword search or controlled vocabulary was the more effective method. The authors walk us through the main research articles on keyword vs. subject headings and conclude that all but one show a clear advantage for subject heading searches. One third or more of the words that make keyword searching hits available are the controlled vocabulary added by humans. The Principle of Least Effort holds that users will end a search before fully finishing if it proves too hard; one way to guarantee an incomplete search is to not include subject headings.

A strong point repeated throughout the literature is that added subject headings often attach concepts or important terms to the bibliographic record that it wouldn’t otherwise contain. Humans add the subject headings and humans think like humans. Human catalogers are still ahead of automation on assigning appropriate information in line with what humans look for. Maybe someday that will change as AI advances but we aren’t there yet.

They also speak to the weakness of full-text searching, including a discussion of Zipf's Law, which, as they apply it, basically states that for every three times a keyword appears in a text, it appears with a different meaning. So 12 appearances = 4 different meanings, which muddies any full-text search and also makes the automation of attributes problematic.

The authors offer some suggestions for fuller search results, such as adding folksonomies and user search terms to augment the record, creating tools for untrained users, and automatically adding TOCs, summaries, and other metadata. They don't advocate these as sole solutions but as complements to the use of a controlled vocabulary, noting their complementary potential.

The authors go on to a discussion of future research and applications, mentioning that faceted searching and relevance ranking were just emerging at the time of the article, and that new linked-data frameworks would also require the controlled vocabularies we use. They talk about the separate areas of metadata schemas and bibliographic control converging, and suggest that new interfaces could use many of the combined solutions noted above.

Their literature review uncovers many additional advantages of including controlled vocabulary: grouping synonyms, variant spellings, and word forms; providing references to and from obsolete terms; distinguishing among various meanings of the same term; providing hierarchical references; and providing searchable text for non-textual resources.

They then cite enterprise search analysis that clearly demonstrated the cost effectiveness of adding subject metadata. Inadequate subject heading input simply transfers the lowered cost of cataloging to additional user cost in time and effort. The balance sheet totals the same; the only difference is 'who pays the bill.' Serving the user and making the system as easy to use and transparent as possible have been themes throughout LS500. A related disservice is that not fully cataloging information effectively removes that information from many users' ability to find it.

We have learned that our profession's role is to mediate, or to teach the user to effectively use the retrieval tool. A tool is only as good as the material it uses, and a structured surrogate record must be detailed to be useful. Controlled vocabulary narrows and refines the search, and adds terms and relationships that help a user pinpoint what they need. Keyword searching results in dilution and ambiguity; it doesn't help the user develop a search or find information.

This paper touches on many points from both LS500 and LS501. The ties between the classes and this article were too numerous to mention them all, so I chose a few. I was fascinated to get an in-depth look at the strong case for controlled vocabulary since, having lived in the internet universe for so long, I did not really see the full weakness of keyword searching. I am now a believer. Apologies if this post is a little long; the paper was so lengthy that it was difficult to summarize without leaving out exciting points. I almost shied away from it due to its length but enjoyed it immensely, and feel I now have a broader understanding and a better merging of LIS guiding principles and ethics.

Article Summary for Lecture #7 – Naun

Objectivity and Subject Access in the Print Library

Published in 2006, this article is a conceptual, not historical, examination of library science's approach to subject representation as an inheritance of its original work in the print environment. It is an endorsement of that structure and framework, as the author challenges us to apply it to bring current technology and our social mission into harmony. By the end of the article, one could conclude that, by drawing on our profession's history of guiding principles for subject access and partnering with technology, this is the first time in history that we can meet both the user's need for their 'first term to search' and the need for 'subject headings that reflect the most unbiased view at the time.'

The author begins with a look at librarianship's organizing and service-provision functions. He delves into the print environment and how it has shaped our function through its economics, distribution, and collection demands.

He quotes Cutter's dictum that the subject entry chosen should be the 'one the user first looks for.' He goes on to elaborate that Cutter noted that more than one entry (one to meet the rule and one to meet the needs of the user) was not forbidden by the dictionary principle, but that the multiple-entry approach was limited by the physical bulk and workload of a print catalog.

He speaks about our profession’s guiding principles for representing subject content. Just like the ‘bulk’ of a card catalog there is also the ‘bulk’ of limitless vocabulary. A controlled vocabulary is needed so the preferred terms can be found.

He discusses how current technologies perform subject access: terms are analyzed statistically by relative frequency, as opposed to the 'conscious conceptualization of subject content' that the library profession espouses. This type of search increases the richness of subject content, but also results in dilution. He acknowledges that full-text searching can mitigate bias on one hand, then adds that natural language has biases of its own.

Library methods of indexing attempt to compensate for bias by having consciously held values, principles and rules. Like full text searching, this also contains both the possibility of bias and the possibility of overcoming bias. The author concludes by reminding us that “impartiality does not demand infallibility so much as vigilance. The consensus on any subject can vary between communities, and change over time.”

How to construct objectivity is his next topic. He goes beyond Cutter and begins to discuss Hulme's principle of literary warrant. This develops into a discussion of the library's relationship to human discourse and of how predetermined class structures are themselves shaped by the society they develop from.

He develops the idea that librarians' objectivity has more to do with arbitrating among different views than with adopting a single, so-called neutral or objective view. This can be misconstrued as liberalism, but it isn't taking the 'liberal' view of a subject; it is advocating and ensuring equal access to all views of a subject. Subjects can address prevailing norms as well as competing ideas, and should change with the times. Vocabularies must be able to serve all users across a group.

He speaks of the specific steps needed for impartiality, beyond not causing offense or giving equal time to sides. He states that the end result must be a search that finds whatever the collection has on the subject, without favoring a side or obscuring any relevant material.

I found this article very inspiring, since LS501 has so clearly demonstrated the inherent bias in most aspects of libraries and librarianship. I was encouraged that the concept of objectivity is actively pursued, and that technology, usage, and relevance rating could play a part in furthering the goal of objectivity rather than just serving as a dilution factor. This has been my personal feeling, and (perhaps because of my cognitive bias) it felt good to have an author clearly address how it could be harnessed to serve the higher good.