Access the Blue Item Bank
The following document details the structure, intended use, and features of the Blue Item Bank.
The Blue Item Bank is a license-controlled feature. Please contact your Explorance account manager for more information.
Structure of the Blue Item Bank
The item bank is structured as follows:
- The first four columns of the item bank provide the division, category, subcategory and course type (if applicable) for each item.
- The fifth column provides the item text, and the sixth column provides the response options.
- The seventh column provides the tag (see below) for the item, if relevant.
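For illustration, one row of the item bank can be modeled as a simple record, as in the following sketch. The field names and example values are illustrative assumptions based on the column descriptions above, not an official Blue schema.

```python
from dataclasses import dataclass
from typing import Optional

# Sketch of a single item-bank row, following the seven columns described
# above. Field names and example values are illustrative assumptions,
# not an official Blue schema.
@dataclass
class ItemBankRow:
    division: str              # e.g., "Institution Wide" or a specific division
    category: str              # level of use, e.g., "Institutional"
    subcategory: Optional[str] # e.g., "Psychology" or "Course Documents"
    course_type: Optional[str] # e.g., "Project courses", if applicable
    item: str                  # the question text shown to students
    scale: str                 # the response options
    tag: Optional[str]         # e.g., "attribute" or "context", if relevant

row = ItemBankRow(
    division="Institution Wide",
    category="Institutional",
    subcategory="Institutional Composite Mean",
    course_type=None,
    item="The instructor created a course atmosphere that was conducive to my learning.",
    scale="Not at all; Somewhat; Moderately; Mostly; A great deal",
    tag=None,
)
```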
Division/category/subcategory/course type
These columns identify when and where a particular item is used within an institution. Throughout the item bank, the University of Toronto is used as an example.
Division
- For institutional items, this column simply reads “Institution Wide,” as these items apply across the university and all of its campuses.
- Otherwise, this column indicates the division in which the question is used.
- Note that the campuses are treated as separate divisions (as explained below).
- For instructor-selected items, this column contains a unique letter-and-number code for a given item. The letter corresponds to the item’s category/theme (see “Instructor-selected items” below) and the number is sequential within a given category.
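As an illustration of this coding scheme, the following sketch parses such a code (e.g., “Y5”) into its theme letter and sequential number; the function name and return shape are illustrative assumptions, not part of Blue.

```python
import re

# Sketch: split an instructor-selected item code such as "Y5" into its
# theme letter and sequential number. Purely illustrative.
def parse_item_code(code: str) -> tuple[str, int]:
    match = re.fullmatch(r"([A-Z]+)(\d+)", code)
    if match is None:
        raise ValueError(f"Unrecognized item code: {code!r}")
    theme, number = match.groups()
    return theme, int(number)

print(parse_item_code("Y5"))   # ('Y', 5)
print(parse_item_code("X15"))  # ('X', 15)
```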
Category
- This indicates the level at which the question is used (e.g., institutional, divisional, departmental).
- Where applicable, graduate versus undergraduate is also indicated.
- Instructor-selected items (i.e., for Question Personalization) are indicated here.
Subcategory
- Where applicable, this indicates the department in which the question is asked (e.g., “Psychology”).
- This may also indicate special circumstances in which the question is asked (e.g., “All except Chemical Engineering and Biomaterials and Biomedical Engineering”).
- The first five institutional items are indicated here as making up the Institutional Composite Mean.
- For instructor-selected items, this indicates the basic category of the question (e.g., “Course Documents”).
Course type
Where applicable, this indicates whether the question is asked for specific course types (e.g., “Project courses,” “300 & 400 level courses,” “course 369H”).
Cascade categories
The categories indicate three broad sets of items in the Blue Item Bank that correspond to the cascading levels of the University of Toronto Cascaded Framework.
Institutional items
These items are used on course evaluation forms and were developed through an intensive process of consultation overseen by a provostial working group at the University of Toronto. They are all denoted in the item bank by the ‘institutional’ label.
These eight items include six quantitative rating scale items. The first five (sub-categorized as Institutional Composite Mean items 1-5) are averaged at the institution level to create a composite score termed the ‘institutional composite mean’. Factor analyses have shown that these five items form a reliable and stable measure of a single construct reflecting students’ learning experiences at the University of Toronto (Centre for Teaching Support & Innovation, 2018).
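As a rough illustration of such a composite, the following sketch averages institution-level means for the five items. The values are invented for illustration only; see the validation study cited below for the actual methodology.

```python
from statistics import mean

# Sketch: the institutional composite mean as the average of the five
# composite items' institution-level means. The values below are invented
# for illustration; see CTSI (2018) for the actual methodology.
item_means = {
    "ICM item 1": 4.1,
    "ICM item 2": 3.9,
    "ICM item 3": 4.3,
    "ICM item 4": 4.0,
    "ICM item 5": 4.2,
}

icm = mean(item_means.values())
print(f"Institutional composite mean: {icm:.2f}")  # 4.10
```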
The sixth rating scale item is an ‘overall’ item. This item often provides a result similar to the composite score, and has a similar intent, but can differ somewhat (see the note on ‘overall’ items under the ‘broad’ tag below).
The seventh and eighth institutional items are qualitative questions.
Divisional/Departmental items
These items cascade to specific campuses/divisions or faculties/departments. The term “division” is used to refer to each of these heterogeneous superordinate units. The specific division is identified in the level column (e.g., Faculty of Arts and Sciences). The subsequent category and subcategory columns provide further levels of granularity where appropriate (e.g., Undergraduate, Graduate, Psychology, Anthropology, course type).
The items that fall into these categories were developed by local committees for each of the relevant levels of the institution (e.g., divisions or departments) during implementation with a particular group. The item selection process at the divisional/departmental level is meant to ensure that items reflect local contexts and interests.
When reviewing Divisional/Departmental items, it is important to keep in mind that:
- These items may draw from the instructor-selected items (see subsequent section) and thus may be duplicates or variants.
- These items, as the result of local discussions, may be highly specific and/or may not strictly follow best practices in item design, especially when transferred to other contexts.
- Items developed at this level were often included in the instructor-selected item bank at the specific request of the division/department in question. Thus, there are items in the instructor-selected item bank that should be regarded more cautiously. This is explained in greater detail subsequently under “Instructor-selected items”.
Instructor-selected items
These items serve two purposes: instructors can select from them for their own course evaluations (these items are used formatively, and their results are made available only to the instructor), and they provide a definitive starting list to draw from for implementation at divisional/departmental levels, representing priorities identified by specific committees. Items are also periodically added here as they are developed with particular committees (as discussed above under ‘Divisional/Departmental items’). Instructor-selected items are denoted in the level column as “instructor-selected,” and the category column provides a categorical “theme” for a given set of instructor-selected items.
These items were collected and developed centrally, based on a comprehensive review of course evaluation instruments, or were created by committees as described above. Further qualitative and quantitative work to validate and tag these items is ongoing.
Items and scales
The text of the items is provided in the item column and the response options are provided in the scale column.
Note that items that include “the instructor” are intended to be populated with the instructor’s name and asked for each instructor in a course. For instance, the item:
- The instructor created a course atmosphere that was conducive to my learning
should appear in Blue, to a student responding for a particular instructor, as:
- The instructor [Firstname, Lastname] created a course atmosphere that was conducive to my learning.
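Blue performs this substitution internally; as a rough illustration of the idea, a sketch might look like the following. The placeholder convention and function name are illustrative assumptions, not Blue’s actual implementation.

```python
# Sketch: substitute the instructor's name into the generic item text.
# The placeholder convention and function name are illustrative assumptions.
def personalize(item_text: str, first_name: str, last_name: str) -> str:
    """Replace the generic 'The instructor' with a named version."""
    return item_text.replace(
        "The instructor",
        f"The instructor {first_name} {last_name}",
        1,  # replace only the first occurrence
    )

item = "The instructor created a course atmosphere that was conducive to my learning."
print(personalize(item, "Firstname", "Lastname"))
# -> The instructor Firstname Lastname created a course atmosphere that ...
```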
It is important to note that, in some cases, there are similarly worded or variant versions of given items among the divisional/departmental items (both within that category and relative to the instructor-selected items), given the process by which they were created and implemented. Items are all unique, however, within the categories of institutional and instructor-selected items (with one exception: Y5 and X15 are duplicates categorized into two separate subcategories).
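For those auditing an exported item bank for such duplicates, a simple check along the lines of the following sketch may help. The file name and column names are assumptions based on the structure described above, not an official export format.

```python
import csv
from collections import defaultdict

# Sketch: group items by normalized text to surface duplicates across
# categories (such as the Y5/X15 pair noted above). The file name and
# column names are assumptions, not an official export format.
items_by_text = defaultdict(list)

with open("item_bank.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        normalized = " ".join(row["item"].lower().split())
        items_by_text[normalized].append(
            (row["division"], row["category"], row["subcategory"])
        )

for text, places in items_by_text.items():
    if len(places) > 1:
        print(f"{len(places)} occurrences of: {text}")
        for place in places:
            print("  ", place)
```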
Tag
When selecting items for use, it is important to consider each carefully in terms of its appropriateness for particular context(s) and/or intended use(s). These tags, as indicated in the relevant column, are intended to support this process and to provide further details for specific items where appropriate.
Attribute
Items tagged as “attribute” ask students to rate instructors on qualities that may be difficult for students to rate objectively or clearly, such as “approachability” or “reasonableness”. Items such as these are believed to be more likely to produce unreliable and/or potentially biased results. While research evidence on this topic is mixed, it is advisable to consider the use and results of such items cautiously. In general, these items should be used primarily for formative rather than summative purposes.
Context
Items tagged as “context” ask students to respond to statements that provide contextual information, such as rating their interest in a course or their perceived workload in a course. Typically, these items should not be interpreted on their own and are best used to better understand other items. For instance, perceived workload or attendance may help explain students’ responses to other items, but each alone is difficult to interpret for formative or summative purposes. These items are particularly useful for monitoring and analysis purposes (e.g., validation).
Forward-looking
Items tagged as “forward-looking” ask students to evaluate their learning experiences more broadly: for instance, extending perceived learning beyond a particular class or the classroom in general, or asking whether their learning changed the way they think (e.g., inspiration, stimulating new ways of thinking). These items may be challenging for some students to assess accurately, as they ask the student to think beyond their current experiences and/or present frame of reference. Research has shown that learners frequently have difficulty accurately assessing their actual learning outcomes; this is likely even more difficult for such broad outcomes. Thus, while these items are frequently requested, their use and interpretation should be regarded cautiously. It is advised that these items be used primarily for formative purposes.
Frame of reference
Items tagged with “frame of reference” are items where the students’ frame of reference should be carefully considered prior to use. This is typically because they contain terms that students may not always interpret accurately or consistently (e.g., critical reflection, advanced research). These items should only be used when it is clear that the terms in question are well-defined and prominent in a particular course. When implementing such items, it is advised that pilot validation data be collected by asking the intended population of responding students to define the terms within the question, in order to assess their understanding of these concepts.
Broad
Items tagged as “broad” ask about broad learning experiences (often beginning with ‘Overall, the quality of…’). Research has suggested that these items may be less reliable and perhaps more subject to bias. This is because what students consider individually when posed such a question can differ radically, and this open interpretation may lead students to consider factors that are not fully (or at all) relevant to the evaluation of instruction (e.g., the time a course is offered, instructor attributes). Thus, although these are popular questions in course evaluations generally, they should be used cautiously. While they ostensibly distill the complexity of a learning experience into one number, they can conceal important specifics and/or nuances that should be considered in the evaluation of teaching/learning. The results of these items should always be considered in terms of their correspondence to other, more specific and reliable items and/or composite scores (such as the institutional composite mean within this item bank).
Multi-part
These questions have multiple components. So-called double-barrelled questions are well recognised to be troublesome, as respondents may have difficulty rating an item if they would prefer to respond differently to what are effectively two questions. The item bank contains questions that, depending on the context, may fit these criteria, and so they should be considered cautiously. This includes questions whose parts are likely inter-connected, but this should be carefully considered on a case-by-case basis. These include sets of concepts (e.g., theory, practice, and research; opinions and experiences) or processes (e.g., formulate, analyze, and solve problems; think critically about the subject, develop new ideas, and think more broadly) that may sometimes be inextricable, but are not guaranteed to be. Thus, when considering the use or interpretation of these questions, it is important to consider whether these elements are likely to be adequately inter-linked for respondents, in order to avoid unreliable responding due to the double-barrelled issues referenced above.
Expertise
These questions require students to have adequate expertise in the subject matter being taught and/or in pedagogy in order to answer properly and reliably. Examples include “The course provided information on important issues in the subject matter” and “The course instructor had reasonable overall learning expectations for students in the course.” These questions assume that students have enough knowledge of the course material to hold an informed opinion about what is and is not important, or that students have adequate knowledge of teaching practices. Since students, on average, are unlikely to have this kind of knowledge, these questions are less likely to properly assess instructors’ knowledge and teaching skills, and instead simply reflect students’ perceptions. Thus, the likely level of knowledge and/or experience of the students should be carefully considered when selecting or interpreting these questions. In general, it is advised to use these items formatively.
Blue Item Bank implementation
The Blue Item Bank is a license-controlled feature. Please contact your Explorance account manager for more information.
References
Centre for Teaching Support & Innovation. (2018). University of Toronto’s Cascaded Course Evaluation Framework: Validation Study of the Institutional Composite Mean (ICM). Toronto, ON: Centre for Teaching Support & Innovation, University of Toronto. Accessible at: https://teaching.utoronto.ca/wp-content/uploads/Validation-Study%5FCTSI-September-2018.pdf