Week 3 — Problem Identification & Formulation

Session at a Glance

Lecture Topic
Problem identification; from topic to RQ; hypotheses; conceptual frameworks; variables; FINER criteria

Lab Activity
RQ formulation workshop; one-page problem statement draft; peer critique

Duration
2 hrs Lecture + 2 hrs Lab/Project

Milestone
Problem Statement Draft Due

Learning Objectives

By the end of this session, you will be able to:

Identify viable sources of research problems across literature, industry, and personal observation
Narrow a broad topic into a focused, answerable research question using systematic scoping techniques
Evaluate a research problem against the FINER criteria (Feasible, Interesting, Novel, Ethical, Relevant)
Formulate well-structured research questions and, where appropriate, testable hypotheses
Construct a conceptual framework identifying independent, dependent, moderating, mediating, and control variables
Produce a one-page problem statement suitable for supervisor feedback

Session Planner

Suggested breakdown of the 4-hour contact session.

Time	Segment	Activity	Mode
0:00–0:08	Opening	Recap Week 2 paradigms; connect paradigm choice to problem formulation	Whole class
0:08–0:30	Lecture 1	Sources of research problems; from topic → problem → RQ; the FINER criteria	Lecture
0:30–0:55	Lecture 2	Formulating RQs and hypotheses; conceptual frameworks; variable types; BBA + BCA examples	Lecture
0:55–1:10	Activity	RQ diagnosis: evaluate 6 RQs against FINER criteria; classify as good or problematic	Pairs
1:10–1:25	Discussion	Share evaluations; discuss what makes an RQ "researchable"	Whole class
1:25–1:40	Break	—	—
1:40–2:00	Lab Briefing	RQ formulation workshop instructions; problem statement template walkthrough; FINER self-check	Demo
2:00–3:30	Lab Work	Individual RQ refinement; one-page problem statement drafting; peer critique in pairs	Individual/Pairs
3:30–3:50	Discussion	Volunteers share problem statements; facilitator highlights strong examples	Whole class
3:50–4:00	Exit Ticket	Submit problem statement draft; self-assess against FINER	Individual

1. Where Do Research Problems Come From?

Students often say: "I don't know what to research." But research problems are everywhere — you just need to know where to look and how to recognize them. The best research problems sit at the intersection of what interests you, what matters to a community, and what hasn't been answered yet.

① Literature Gaps

Published papers often end with a "future research directions" or "limitations of this study" section. These are gold. A gap exists when: (a) a relationship hasn't been tested in a new context, (b) a method hasn't been applied to a problem, (c) findings conflict across studies, or (d) a population or technology is under-studied.

BBA: "Prior research studied CSR in large firms — what about SMEs?"
BCA: "Federated learning has been studied for image data — what about tabular healthcare data?"

② Industry & Practice Challenges

Real organizations face real problems that lack evidence-based answers. Industry conferences, practitioner journals, and your own internship experience are rich sources. The question: "Is this just one company's problem, or does it point to a broader gap in knowledge?"

BBA: "The company I interned at struggled with employee retention after going remote — is this a generalizable phenomenon?"
BCA: "Our team kept introducing bugs during CI/CD — what practices actually reduce defect injection rates?"

③ Personal Observation & Curiosity

Something you've noticed, experienced, or wondered about. These problems have the advantage of genuine motivation — you actually care about the answer. The challenge is moving from personal curiosity to a researchable question with broader significance.

BBA: "Why do my friends trust UPI for payments but not for investments?"
BCA: "Why does my app's battery drain spike on certain devices but not others?"

④ Supervisor Expertise

Your supervisor has spent years in a research area. They know which questions are timely, which methods are appropriate, and — crucially — which problems are too big, too small, or already solved. A supervisor-suggested topic is not "cheating" — it's leveraging expertise to avoid dead ends.

Ask your supervisor: "What are the open questions in your area that would be appropriate for an undergraduate capstone?"

The Sweet Spot

The best capstone problems are personally motivating (you'll sustain interest for 30 weeks), intellectually significant (they contribute to the literature), and practically feasible (data exists or can be collected; you have or can acquire the necessary skills). If a problem lacks any one of these three, it will cause trouble later.

2. From Topic to Problem to Research Question — Narrowing the Scope

The most common mistake in capstone proposals is starting too broad. "I want to study digital marketing" is not a research problem — it's a domain. "I want to study artificial intelligence" is not a research problem — it's a field. You must progressively narrow until you reach a question that can be answered within 30 weeks with the resources available to an undergraduate.

2.1 The Narrowing Funnel

Step 1 — Domain (Too Broad)

"Digital Marketing" / "Machine Learning" / "Consumer Behaviour" / "Cybersecurity"

↓ Narrow by adding context

Step 2 — Focused Area

"Social media influencer marketing effectiveness among Gen Z consumers in Indian metros"
"Transformer model compression for low-resource Indian languages"

↓ Narrow by identifying a specific gap or tension

Step 3 — Research Problem

"While influencer marketing spend is growing rapidly in India, we don't know whether micro-influencers (10k–100k followers) or macro-influencers (100k+) generate higher engagement-to-conversion ratios for D2C brands — existing studies report conflicting findings and focus on Western markets."

↓ Narrow into an answerable question

Step 4 — Research Question

RQ: "How does influencer tier (micro vs. macro) affect engagement rate and purchase conversion among Gen Z consumers of Indian D2C beauty brands on Instagram?"

2.2 BCA Narrowing Example

Step 1 — Domain

"Natural Language Processing for Indian languages"

↓ Focus

Step 2 — Focused Area

"Sentiment analysis for Hindi-English code-mixed social media text"

↓ Identify gap

Step 3 — Research Problem

"Existing multilingual models (mBERT, XLM-R) often perform poorly on Hindi-English code-mixed text because their pre-training corpora consist largely of monolingual text in each language, with little code-mixed data. Systematic comparisons of fine-tuning strategies specifically for code-mixed sentiment analysis are scarce, and publicly available benchmarks for Hindi-English code-mixed sentiment remain limited."

↓ Question

Step 4 — Research Question

RQ1: "How does fine-tuning strategy (full model vs. adapter-based vs. prefix-tuning) affect sentiment classification accuracy on Hindi-English code-mixed text?"
RQ2: "Does synthetic code-mixed data augmentation improve performance over fine-tuning on naturally occurring code-mixed data alone?"

3. The FINER Criteria — Is Your Problem Worth Pursuing?

Hulley et al. (2007) proposed the FINER framework for evaluating research problems. Before committing 30 weeks to a problem, run it through these five filters:

Criterion	Meaning	Questions to Ask	Red Flags
F — Feasible	Can you actually do it within 30 weeks with undergraduate resources?	Do you have access to data/participants? Do you have the necessary skills (or can you learn them in time)? Is the scope manageable? Can you get ethical clearance if needed?	Data requires corporate partnerships you don't have; needs 2 years of fieldwork; requires a ₹50 lakh lab; needs 10,000 survey respondents with no budget
I — Interesting	Will you sustain motivation for 30 weeks? Will your supervisor, examiners, and peers find it engaging?	Do you genuinely care about the answer? Is the question intellectually stimulating — not just fact-gathering? Will you still want to work on this in Week 25?	"I'm doing this because it's easy" / "My supervisor told me to" (with no personal investment) / the question has an obvious answer
N — Novel	Does it contribute something new — even modestly — to existing knowledge?	Has someone already answered this exact question? If yes, what's different about your approach (context, method, population, time period)? Can you articulate the gap in one sentence?	A pure replication with no new context; a question that a Google search answers; "I'm going to prove what everyone already knows"
E — Ethical	Can the research be conducted without harming participants, violating privacy, or breaching integrity?	Does your study involve vulnerable populations? Will you need informed consent? Are there data privacy concerns (GDPR, DPDP Act 2023)? Could your findings be misused?	Deceptive experiments without debriefing; collecting personal data without consent; studying children/minors without special protocols; security research that could enable attacks
R — Relevant	Does the answer matter — to the literature, to practice, to society?	Who will care about your findings? What decision could your research inform? What problem does it help solve? Is the relevance beyond your personal interest?	"I'm curious" (with no broader relevance); studying a phenomenon that no longer exists; research that can't inform any decision or practice

FINER in Practice

Most capstone problems fail on Feasibility (too ambitious) or Novelty (not a real gap). The art is finding a problem that is novel enough to contribute but feasible enough to complete. Your supervisor is your best guide on this trade-off — they've seen what works and what doesn't.

4. Formulating Research Questions and Hypotheses

4.1 Research Questions — The Engine of Your Capstone

A well-formed research question (RQ) is clear, focused, complex, and answerable. It is not a topic, not a yes/no question, not a statement of intent. It is the question your entire dissertation will answer. Most capstones have 1–3 RQs (not 10 — focus is a virtue).

RQ Type	Purpose	Starter Phrase	BBA Example	BCA Example
Descriptive	Describes a phenomenon	"What are the..." "How prevalent is..."	"What are the primary barriers to digital payment adoption among street vendors in Mumbai?"	"What types of technical debt are most commonly reported in open-source microservice projects?"
Relational / Comparative	Examines relationships or differences	"How does X relate to Y?" "To what extent does A differ from B?"	"How does brand activism perception relate to purchase intention among Gen Z consumers in India?"	"How does container runtime (Docker vs. containerd vs. gVisor) affect cold-start latency in serverless functions?"
Causal	Tests cause-effect	"What is the effect of X on Y?" "To what extent does X cause Y?"	"What is the effect of flexible work arrangements on employee retention in Indian IT SMEs?"	"What is the effect of code review checklist usage on defect detection rate in agile teams?"
Design / Construction	Creates and evaluates an artefact	"How can we design a... that...?" "Can a... improve...?"	"How can we design a predictive model for early-stage startup failure in the Indian fintech sector, and how accurately does it perform?"	"Can a lightweight transformer-based model achieve comparable accuracy to large models for code-mixed sentiment analysis while reducing inference latency by 50%?"

4.2 Hypotheses — When and How to Use Them

Hypotheses are formal, testable predictions derived from theory — they belong primarily to the positivist paradigm. If you're doing interpretivist, pragmatist, or design science research, you may not need hypotheses at all (propositions or design goals may be more appropriate).

Hypothesis

A hypothesis is a specific, testable prediction about the relationship between two or more variables. It is stated in a form that allows it to be rejected (falsified) by empirical evidence. A hypothesis is never "proved" — it is supported or not supported by the data.

Element	Explanation	Example
Null Hypothesis (H₀)	States there is NO relationship/effect/difference in the population. This is what you test statistically.	"There is no difference in purchase conversion between micro-influencer and macro-influencer campaigns."
Alternative Hypothesis (H₁)	States there IS a relationship/effect/difference. This is what your theory leads you to expect.	"Micro-influencer campaigns generate higher purchase conversion than macro-influencer campaigns."
Directional	Specifies the direction of the relationship (higher/lower/more/less).	"Micro-influencers generate higher engagement than macro-influencers."
Non-directional	Predicts a difference but not its direction. Used when the literature is mixed or the study is exploratory.	"There is a difference in engagement between micro- and macro-influencers."
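
To make the H₀/H₁ and directional/non-directional distinction concrete, here is a minimal sketch of one way such a hypothesis could be tested: a two-proportion z-test on purchase conversion. The test choice, the function name `two_proportion_z_test`, the 0.05 threshold, and all the counts are assumptions made for illustration; they are not data or a prescribed method from this course.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Test H0: the two conversion rates are equal.

    Returns (z, two_sided_p, one_sided_p), where one_sided_p is for the
    directional alternative "rate A is higher than rate B".
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate: the best estimate of the common rate if H0 is true.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    std_normal = NormalDist()
    one_sided_p = 1 - std_normal.cdf(z)
    two_sided_p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, two_sided_p, one_sided_p


# Hypothetical counts, invented purely for illustration.
micro_conversions, micro_clicks = 60, 1000
macro_conversions, macro_clicks = 40, 1000

z, p_two, p_one = two_proportion_z_test(
    micro_conversions, micro_clicks, macro_conversions, macro_clicks
)
alpha = 0.05  # assumed significance threshold
print(f"z = {z:.3f}")
print(f"Non-directional H1: p = {p_two:.4f}, reject H0: {p_two < alpha}")
print(f"Directional H1 (micro > macro): p = {p_one:.4f}, reject H0: {p_one < alpha}")
```

Note that the code only ever rejects or fails to reject H₀. It never "proves" H₁, which matches the definition of a hypothesis given above.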

RQ or Hypothesis — Which One?

If your paradigm is positivist and you have strong theory predicting a relationship → use hypotheses. If your paradigm is interpretivist → use RQs only (hypotheses make no sense when you're exploring meanings). If pragmatist → RQs for the qualitative strand, hypotheses (optional) for the quantitative strand. If DSR → design goals or evaluation questions, not hypotheses in the statistical sense.

5. Conceptual Frameworks and Theoretical Grounding

A conceptual framework is a visual or narrative representation of the key concepts in your study and the relationships you expect to find among them. It shows what you're studying, what you think is connected to what, and — crucially — why (grounded in theory). It is the bridge between your literature review and your methodology.

Conceptual Framework

A system of concepts, assumptions, expectations, beliefs, and theories that supports and informs your research. It explains — graphically or in narrative form — the main things to be studied: the key factors, constructs, or variables, and the presumed relationships among them.

5.1 Variables — The Building Blocks of Frameworks

Variable Type	Role	BBA Example	BCA Example
Independent (IV)	The cause / predictor / what you manipulate	Influencer tier (micro vs. macro)	Quantization method (dynamic vs. static vs. QAT)
Dependent (DV)	The effect / outcome / what you measure	Purchase conversion rate	Model accuracy (F1-score)
Moderating (MV)	Changes the strength or direction of the IV→DV relationship	Product category (beauty vs. fashion vs. electronics) — influencer effect may differ by category	Dataset size — quantization effect may differ for small vs. large datasets
Mediating (MedV)	Explains the mechanism — WHY the IV affects the DV	Perceived authenticity — micro-influencers → higher authenticity → higher conversion	Representation capacity loss — quantization → loss of representational precision → lower accuracy
Control (CV)	Held constant to isolate the IV→DV relationship	Brand familiarity, post frequency, time of posting, follower count within each tier	Hardware platform, batch size, input sequence length, random seed
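
The difference between a moderator and a mediator is easier to see when written as equations. The sketch below uses one common linear formulation; the symbols are assumptions introduced here for illustration (X for the IV, Y for the DV, W for the moderator, M for the mediator), not notation from the table.

```latex
% Moderation: W changes the strength or direction of the X -> Y effect.
\begin{align*}
Y &= \beta_0 + \beta_1 X + \beta_2 W + \beta_3 (X \cdot W) + \varepsilon
\end{align*}

% Mediation: X affects Y through M.
\begin{align*}
M &= i_1 + a X + \varepsilon_1 \\
Y &= i_2 + c' X + b M + \varepsilon_2
\end{align*}
```

In this formulation, a non-zero interaction coefficient β₃ is what indicates moderation. For mediation, the indirect effect of X on Y through M is the product ab, and c' is the direct effect that remains once M is accounted for. Control variables would enter each equation as additional terms.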

5.2 Example Conceptual Framework (BBA)

IV: Influencer Tier (Micro vs. Macro) → MedV: Perceived Authenticity → DV: Purchase Conversion

Moderator: Product Category

Controls: Brand Familiarity, Post Frequency, Follower Count (within each tier)

Theoretical grounding: Source Credibility Theory (Hovland et al.) + Meaning Transfer Model (McCracken)

5.3 Example Conceptual Framework (BCA)

IV: Fine-tuning Strategy (Full / Adapter / Prefix) → MedV: Parameter Efficiency (Trainable params / total) → DV: Sentiment F1-Score

Moderator: Code-mixing Density (% of English tokens)

Controls: Base Model (XLM-R), Seed, Batch Size, Epochs

Theoretical grounding: Lottery Ticket Hypothesis (Frankle & Carbin) + Parameter-Efficient Transfer Learning (Houlsby et al.)
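
Two of the variables in this framework are defined by simple ratios, so it can help to see how they might be operationalized. The sketch below is illustrative only: the function names, the parameter counts, and the per-token language tags are hypothetical, and it assumes the tags have already been produced by a separate language-identification step.

```python
def parameter_efficiency(trainable_params, total_params):
    """Fraction of model parameters that are updated during fine-tuning."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return trainable_params / total_params


def code_mixing_density(token_langs, embedded_lang="en"):
    """Percentage of tokens tagged with the embedded language (English here)."""
    if not token_langs:
        raise ValueError("token_langs must not be empty")
    embedded = sum(1 for lang in token_langs if lang == embedded_lang)
    return 100 * embedded / len(token_langs)


# Hypothetical parameter counts for an adapter-style setup.
efficiency = parameter_efficiency(trainable_params=1_000_000, total_params=100_000_000)

# Hypothetical per-token language tags for one five-token sentence.
tags = ["hi", "hi", "en", "hi", "en"]
density = code_mixing_density(tags)

print(f"Parameter efficiency: {efficiency:.2%}")
print(f"Code-mixing density: {density:.1f}% English tokens")
```

Writing the definitions down this early forces a decision on details the diagram hides, such as whether punctuation and named entities count as tokens.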

Framework Without Theory is Just a Diagram

A conceptual framework must be grounded in theory. Don't just draw boxes and arrows because they look good — each arrow should represent a relationship that prior literature suggests exists. For every path in your framework, you should be able to answer: "Which theory or prior study supports this relationship?" If you can't, the arrow doesn't belong.
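
The "which theory supports this arrow?" check can be sketched as a small data structure. This is a minimal illustration, not a required tool: the `Path` class and `unsupported_paths` function are hypothetical names, the pairing of each theory with a specific arrow is an assumption made for the example, and the third arrow is deliberately invented to show what an ungrounded path looks like.

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    """One arrow in a conceptual framework."""
    source: str
    target: str
    support: list = field(default_factory=list)  # theories or prior studies


def unsupported_paths(paths):
    """Return the arrows that have no supporting theory or prior study."""
    return [p for p in paths if not p.support]


framework = [
    # Theory-to-arrow pairings below are assumed for illustration.
    Path("Influencer Tier", "Perceived Authenticity", ["Source Credibility Theory"]),
    Path("Perceived Authenticity", "Purchase Conversion", ["Meaning Transfer Model"]),
    # Hypothetical arrow drawn "because it looks good": no support listed.
    Path("Influencer Tier", "Post Timing"),
]

for path in unsupported_paths(framework):
    print(f"No support for: {path.source} -> {path.target}")
```

Any arrow the check flags either needs a citation or should be removed from the diagram.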

6. Same Structure, Different Discipline — A Dual Illustration

The structure of a good research problem is discipline-independent. Whether you study brand loyalty (BBA) or software adoption (BCA), the logical architecture is the same. What changes is the phenomenon, the data, and the method — not the underlying logic of inquiry.

Element	BBA — Brand Loyalty	BCA — Software Adoption
Broad domain	Consumer behaviour	Software engineering / HCI
Problem	Indian D2C brands invest heavily in loyalty programs, but it's unclear which program features actually drive repeat purchase behaviour among Gen Z consumers.	Organizations invest in static analysis tools, but adoption by developers remains low despite evidence that these tools find real bugs — it's unclear which factors predict sustained tool usage.
RQ	"What factors influence brand loyalty program engagement and repeat purchase behaviour among Gen Z consumers of Indian D2C brands?"	"What factors influence sustained adoption of static analysis tools among professional software developers?"
Paradigm	Positivist (survey + statistical modelling)	Pragmatist (survey + follow-up interviews with developers who adopted AND abandoned tools)
IV(s)	Reward type, personalization, community features, redemption ease	Tool usability, result actionability, integration with IDE, organizational mandate
DV	Repeat purchase frequency	Tool usage frequency and duration
Moderator	Product category involvement (high vs. low)	Developer experience level (junior vs. senior)
Theory base	Expectation-Confirmation Theory (Oliver); Technology Acceptance Model (Davis)	Technology Acceptance Model (Davis); Unified Theory of Acceptance and Use of Technology (Venkatesh et al.)

The Transferable Skill

The ability to take a vague topic and craft it into a clear, researchable problem with defined variables, a theoretical base, and a justified paradigm is the most valuable skill this course teaches. It transfers to consulting (diagnosing client problems), product management (defining what to measure), data science (framing analysis questions), and academic research. The phenomenon changes; the thinking doesn't.

Think Deeper — Cross Questions

Discuss in pairs before sharing with the class.

CQ 1

A student says: "My research question is: 'How can Indian startups succeed?'" What is wrong with this as an RQ? Apply the narrowing funnel and rewrite it into a focused, answerable research question.

CQ 2

Apply the FINER criteria to a research problem you're considering. Which criterion is most at risk for your topic? What could you change — scope, method, data source — to strengthen the weakest criterion?

CQ 3

A BCA student's conceptual framework has 8 independent variables, 3 mediating variables, and 4 dependent variables. Is this a problem? What principle of good research design is being violated, and how would you advise this student?

CQ 4

"My research doesn't need theory — I'm just describing what's happening." Do you agree with this statement? When might descriptive research still require a conceptual framework? What does theory add even when you're not testing causal hypotheses?

Quick Check — RQ Diagnosis

Each item below is offered as an "RQ". Diagnose it: decide whether it is well-formed or problematic and, if it is problematic, describe the issue as accurately as you can.

1. "This research aims to study the impact of digital transformation on business performance."

2. "Do Instagram ads work better than Facebook ads for Indian D2C brands?"

3. "What is the effect of 4-bit weight-only quantization versus 8-bit weight-and-activation quantization on BERT-base's GLUE benchmark scores, inference latency on NVIDIA T4 GPUs, and model size in megabytes?"

4. "How do first-generation women entrepreneurs in Tier-2 Indian cities experience and navigate gender-based challenges while building ventures, and what strategies do they develop to sustain their businesses?"

5. "To investigate the relationship between employee engagement and productivity in the manufacturing sector."

6. "H₁: Micro-influencers generate significantly higher Instagram engagement rates than macro-influencers for Indian D2C beauty brands. H₂: This effect is moderated by content type (tutorial vs. testimonial vs. unboxing), such that the micro-influencer advantage is strongest for testimonial content."

Lab Activity — RQ Formulation Workshop & Problem Statement Draft

Part A: RQ Formulation Workshop (60 min)

Start with your tentative topic from Week 2's Topic Negotiation Worksheet.
Apply the narrowing funnel: Write your topic at each level — Domain → Focused Area → Research Problem → Research Question. You should have four written statements of increasing specificity.
FINER self-check: Rate your RQ on each FINER criterion (1–5). Identify the weakest criterion and write one sentence about how you'll strengthen it.
Peer critique: Exchange RQs with a partner. Each partner gives feedback: Is the RQ clear? Is it answerable in 30 weeks? Are the key concepts defined? What's missing?
Revise: Incorporate peer feedback and produce a revised RQ.
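
The FINER self-check in the steps above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the function name `weakest_criteria` and the example ratings are hypothetical, and reporting ties as a list is a design choice made here, not part of the FINER framework.

```python
FINER = ("Feasible", "Interesting", "Novel", "Ethical", "Relevant")


def weakest_criteria(ratings):
    """Return the criterion (or tied criteria) with the lowest 1-5 rating."""
    missing = [c for c in FINER if c not in ratings]
    if missing:
        raise ValueError(f"Missing ratings for: {missing}")
    for criterion in FINER:
        if not 1 <= ratings[criterion] <= 5:
            raise ValueError(f"{criterion} must be rated from 1 to 5")
    lowest = min(ratings[c] for c in FINER)
    return [c for c in FINER if ratings[c] == lowest]


# Hypothetical self-ratings for one RQ.
my_ratings = {
    "Feasible": 2,
    "Interesting": 5,
    "Novel": 3,
    "Ethical": 4,
    "Relevant": 4,
}

weakest = weakest_criteria(my_ratings)
print("Weakest criterion:", ", ".join(weakest))
```

The number only points you to the criterion that needs attention. The sentence you write about how you will strengthen it is the part your supervisor will read.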

Part B: One-Page Problem Statement Draft

Using the template below, draft your one-page problem statement. This will be submitted to your supervisor and forms the foundation of your proposal (Week 8).

Problem Statement Template

1. Working Title (descriptive, not clever — max 15 words)

2. Background (3–5 sentences: What is the broader domain? Why does it matter? What do we already know? What don't we know? Cite at least 2–3 key references that establish the context.)

3. Problem Statement (3–4 sentences: What specific gap, tension, or unanswered question exists in the literature or practice? Why is this gap significant? What is the consequence of NOT addressing it?)

4. Research Question(s) (1–3 focused, answerable RQs. For positivist studies, include hypotheses below the RQs.)

5. Proposed Paradigm & Method (2–3 sentences: Which paradigm? Which research strategy — survey, experiment, case study, DSR, etc.? Why is this appropriate for your RQ?)

6. Expected Contribution (2–3 sentences: Who will benefit from your findings? What decision, practice, or understanding could your research inform?)

7. Key References (5–8 references that establish the gap and ground your study. Use APA 7th for BBA, IEEE for BCA.)

Exit Ticket

Submit with your problem statement draft.

State your refined RQ in one sentence.
FINER self-rating: Score your RQ on each criterion (F __ / I __ / N __ / E __ / R __). Which is weakest?
Identify your IV and DV (if applicable). If your study is not variable-based (interpretivist / DSR), identify your phenomenon of interest and how you'll study it.
One concern you have about the feasibility of your topic:
One question you'll ask your supervisor at your first meeting:

Key Takeaways — Week 3

Narrow, Narrow, Narrow

The #1 problem in capstone proposals is excessive breadth. Use the narrowing funnel: Domain → Focused Area → Research Problem → RQ. If your RQ can't be answered in 30 weeks, it's still too broad.

FINER is Your Filter

Run every RQ through FINER. Feasibility and Novelty are where most proposals fail. A feasible but modestly novel contribution beats an ambitious but impossible one every time.

Variables Tell the Story

IV = what you manipulate/predict with. DV = what you measure. Moderator = what changes the relationship. Mediator = why the relationship exists. Controls = what you hold constant. If you can't identify these, your RQ isn't ready.

Theory Grounds Your Framework

A conceptual framework without theory is just boxes and arrows. Every relationship in your framework must be supported by prior literature. "I think these are related" is not a justification.

Facilitator Notes

Preparation Checklist

Print or share the Problem Statement Template for each student.
Prepare 3 worked examples of narrowing funnels — one BBA, one BCA, one cross-disciplinary — to walk through in Lecture 1.
Prepare 3 examples of good vs. problematic RQs for the RQ diagnosis exercise (in addition to the 6 in the quick check).
Have the FINER self-check rubric ready as a handout or digital form.
Coordinate with supervisors: they should expect problem statement drafts from their allocated students by end of this week.
For BCA students: prepare additional examples of DSR problem statements (which look different from positivist problem statements — problem-motivation rather than gap-hypothesis).

Common Student Difficulties

Confusing a topic with an RQ: "I want to study fintech" or "My RQ is about blockchain." Keep pushing: "What about fintech? What question about blockchain?" The narrowing funnel exercise is the antidote.
Making RQs that are yes/no questions: "Does CSR affect profitability?" can be answered with one word. Encourage RQs that ask "how," "to what extent," "under what conditions" — questions that require analysis, not just a verdict.
Too many variables: Students often propose frameworks with 6+ IVs, 3 mediators, and 4 DVs. Teach the principle of parsimony: model the most important relationships, not every possible one.
Theory anxiety: Students may not know which theory to use. This is normal at Week 3. Direct them to their supervisor and to review papers in their area (look at which theories those papers use).
BCA students not identifying variables: In DSR, the "variables" may be design parameters rather than IVs/DVs in the traditional sense. Help them map DSR concepts (design goals, evaluation criteria) to the variable framework.

Pacing Tips

The narrowing funnel is the most important concept this week — give it sufficient time. Walk through both the BBA and BCA examples slowly.
The RQ diagnosis exercise works best when pairs discuss before the whole-class debrief. Give 10–12 min for pair discussion.
If lab time is tight, prioritize the Problem Statement Draft over the full RQ formulation workshop. The draft is the milestone deliverable.
For mixed BBA/BCA cohorts: pair cross-discipline students for peer critique — it forces them to explain their RQs to someone outside their domain, which reveals vagueness.