Blog about short term memory research

    Confusion reigns in testing for Alzheimer’s disease

    Eugen G Tarnow  May 24 2013 09:11:08 AM
    Confusion reigns in testing for Alzheimer’s disease (AD) due to ad hoc developments of tests, disagreement as to what constitutes a diagnosis of AD, and the lack of inter-disciplinary discourse.  

    Thus in a recent review article on screening methods for memory disorders, Ashford (2008) discusses the historically ad hoc development of cognitive tests for Alzheimer’s Disease (AD).  The most commonly used test, the Mini Mental State Exam (MMSE, Folstein et al, 1975), was developed in an evening (Folstein 1990).  The Selective Reminding Test (Buschke, 1973) was constructed to test not only short term memory but also long term memory.  The rationale behind the test may be flawed since short term memory may last as long as 15 minutes (Tarnow, 2008 using data from Rubin et al, 1999).  If this is indeed the case, the test joins the MMSE in being effective but ad hoc.  A third test, the FCSRT, was patented by Buschke (1999) to calculate a weighted average of the probability of recall in which the weights change with presented item order.  His method can be characterized as ad hoc in that he did not use a well-defined method to arrive at the weights but looked at the curves and guessed.

    Moreover, Alzheimer’s disease (AD) is a controversial disease (Whitehouse & George, 2008). A recent NIH consensus statement (Daviglus et al, 2010) declared that “highly reliable consensus-based diagnostic criteria for cognitive decline, mild cognitive impairment, and Alzheimer’s disease are lacking” and Beach et al (2012) reported that 40% of patients not diagnosed with AD turned out to have AD at autopsy and 17%-30% of those diagnosed with AD did not have it at autopsy.  Thus when tests for AD are developed, the AD diagnosis against which they are validated is unreliable, which, of course, makes the tests unreliable as well.  The Blessed Information-Memory-Concentration test (Blessed et al, 1968) is the only test that has autopsy validity – it is correlated with the neurofibrillary tangle counts of dementia patients who are not severely demented.

    Finally, the communication in the AD field is non-optimal.  Data sharing is hampered by the perceived profit potential (Koslow, 2002).  Competing research groups do not cite each other (I will not embarrass the authors by declaring names).  AD research is, by its complexity, a multidisciplinary effort, but efforts in one field are sometimes ignored by another.  For example, neurology seems to ignore psychology: a consensus in neurology (McKhann et al, 2011) chooses not to include specific memory information from neuropsychology that is known to lead to an initial diagnosis of AD, and Wright & Zonderman (2013) write that “inexpensive and noninvasive screening measures with both good sensitivity and specificity are needed” even though groups associated with both Ashford and Grober have claimed to have such tests for many years now and Ashford provides a list of 20 manual tests and 18 computerized tests.

      Short term memory computer models

      Eugen G Tarnow  March 10 2013 07:26:07 AM
      I received a negative referee comment stating that in contrast to my statement - that my theory explained 80% of the variance of the initial recall distribution without using interference or other context-based theories - "extant theories of free recall that invoke contextual-change and interference-based forgetting the Temporal Context Model (TCM; Howard & Kahana, 2002), the Context-Maintenance and Retrieval (CMR; Polyn, Norman, & Kahana, 2009) model, and the Serial-Free Recall model (Farrell, 2012) provide excellent descriptive accounts of the total recall and first recall probability serial position curves."

      Models that capture the essence of a natural phenomenon can be extraordinarily useful.  An example is Pauling's chemical bond model.  One does not even need a computer to apply it!  Other times models can be of very little use, such as tight-binding models of semiconductors: after the fact they can explain everything, but they can predict very little of value before we know the answers.  These models can interpolate some information from small changes in configurations but they can extrapolate nothing at all.  (I used to do simulations of the electronic structure of semiconductors, so I know.)

      Another example is provided by the stock market models that led to the financial crisis of 2008 - their authors had neglected to include the possibility of housing prices going down.  A third example is the unbelievably famous, but incorrect, model of short term memory by Atkinson & Shiffrin (see ).  Here the model authors had the data they wanted to explain and constructed a model that described the data, but then failed to see what it would predict about the same data.  But that did not prevent the authors from convincing everyone that short term memory has two components.  Fitting a model to known data does not make the model "explain" anything at all; all it does is fit the data.  The Pauling model works because it goes beyond fitting: it captures the essence of chemical interactions.

      This is my gut feeling about computer models of short term memory in general.  In semiconductors we at least know what the equations are that need to be solved.  In neuroscience we know pretty much nothing except for the equations concerning isolated extremely simple neurons.  

      Nevertheless, can't computer models be useful at all?  Yes, if they are controlled and one is looking to understand particular aspects of memory.  For example, if a model is fit to a particular set of data points and one can then show that another, independent data set is also well described, then the theoretical aspects of the computer model can be important.  But are models useful if they have a lot of different parameters that are fit to the same data they are trying to explain?  Absolutely not.  Then the computer model is simply functioning as a way to interpolate data, not a way to understand it.
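
The distinction between fitting data and describing independent data can be made concrete with a toy calculation. The sketch below is only an illustration: the numbers are made up, the straight line stands in for a model with few parameters, and the lookup table is an extreme stand-in for a model with one free parameter per data point. Neither is a model of memory.

```python
def fit_line(xs, ys):
    """Closed-form least squares fit of y = a + b*x (two parameters)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_lookup_table(xs, ys):
    """One free parameter per data point: the 'model' memorizes the data.

    For an x it has never seen it falls back on the mean of the fitted data.
    """
    table = dict(zip(xs, ys))
    fallback = sum(ys) / len(ys)
    return lambda x: table.get(x, fallback)


def variance_explained(predict, xs, ys):
    """1 - SS_res / SS_tot for the given predictions on the given data."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot


# Hypothetical data: one set used for fitting, one independent set.
fit_x, fit_y = [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]
new_x, new_y = [6, 7, 8], [12.2, 13.8, 16.1]

a, b = fit_line(fit_x, fit_y)
line = lambda x: a + b * x
table = fit_lookup_table(fit_x, fit_y)

results = {
    "line, fitted data": variance_explained(line, fit_x, fit_y),
    "line, independent data": variance_explained(line, new_x, new_y),
    "table, fitted data": variance_explained(table, fit_x, fit_y),
    "table, independent data": variance_explained(table, new_x, new_y),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The lookup table reproduces the fitted data perfectly and fails on the independent set, while the two-parameter line does well on both. Only the second comparison says anything about whether a model has captured something.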

      The Context Maintenance & Retrieval model is based not on neurons but on concepts such as "context" which is maintained by "search lights" probing the context.  These "search lights" probe memory items that are nearby in time and in other characteristics.  The model forms a "competition where all of the items compete in parallel to have their features reinstated in the system".  The CMR model can also be described as an "iterative parallel process, where the result of each recall competition affects the course of the subsequent competition."

      Is the CMR model an activation model?  It would seem so.  The authors define "context as a pattern of activity in the cognitive system, separate from the pattern immediately evoked by the perception of a studied item, that changes over time and is associated with other coactive patterns" ... "the notion that the elements of context are activated by some stimulus or event, tend to stay active past the time this stimulus leaves the environment."  Yet there is no discussion of deactivation which would presumably occur as time passes.

      Each memory item is associated with a vector f and each context with a vector c, and the interaction between them is described by two matrices; "a given element in an associative matrix describes the connection strength between a particular feature element, and a particular context element."  In addition to the matrices there are variables that determine how quickly c is updated by f via one matrix and how quickly f is updated by c via the other.
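
To make the vector-and-matrix description less abstract, here is a minimal sketch of this kind of bookkeeping. It is not the authors' code. It assumes a drift rule of the form c_new = rho*c_old + beta*c_in, with rho chosen so that c keeps unit length, and a simple Hebbian (outer product) rule for the context-to-feature matrix. The names `beta`, `rho`, `m_fc` and `m_cf`, the one-hot item vectors and the value beta = 0.6 are all assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [dot(row, v) for row in m]


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def update_context(c, f, m_fc, beta):
    """Drift the context c toward the input that item f evokes via m_fc."""
    c_in = normalize(matvec(m_fc, f))
    overlap = dot(c, c_in)
    # rho solves |rho*c + beta*c_in| = 1 for unit vectors c and c_in.
    rho = math.sqrt(1 + beta ** 2 * (overlap ** 2 - 1)) - beta * overlap
    return [rho * ci + beta * xi for ci, xi in zip(c, c_in)]


def hebbian_update(m, post, pre):
    """In place: m[i][j] += post[i] * pre[j]."""
    for i, p in enumerate(post):
        for j, q in enumerate(pre):
            m[i][j] += p * q


n_items, n_context = 3, 4
items = [[1.0 if j == i else 0.0 for j in range(n_items)] for i in range(n_items)]

# Feature-to-context matrix (fixed here): item i evokes context unit i.
m_fc = [[1.0 if i == j else 0.0 for j in range(n_items)] for i in range(n_context)]
# Context-to-feature matrix, learned during the list presentation.
m_cf = [[0.0] * n_context for _ in range(n_items)]

context = [0.0, 0.0, 0.0, 1.0]  # hypothetical pre-list context
for f in items:
    context = update_context(context, f, m_fc, beta=0.6)
    hebbian_update(m_cf, f, context)  # bind the item to the current context

# Cue memory with the end-of-list context: one activation per item.
activations = matvec(m_cf, context)
print(activations)
```

In this toy version the last item gets the largest activation and the first item the smallest, i.e. a recency gradient and no primacy. Note that nothing in the sketch is tied to neurons; every quantity is an abstract vector or a free parameter.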

      The authors use 11-14 parameters that are fitted to 93 data points of either the Murdock (1962) dataset or the Murdock & Okada (1970) dataset.  When there are so many parameters to fit, there are typically many sets of those parameters that give similar fits.  The authors use a "genetic algorithm-fitting" technique to deal with that problem.  The 93 fitted data points are not exhaustive nor picked independently.  For example, from the probability of first recall only the final three serial positions are used. Selecting which data points to use amounts to effectively adding parameters to the model.

      Going back to the initial free recall distribution: in order to describe it, the authors first include it in the data to which the large number of parameters is fit.  Then they show that there is a good fit for the complete serial position curve for the initial free recall.  Does this provide an "excellent" descriptive account?  Surely not.  In the end they are in fact unhappy with the description of the initial free recalls and suggest that the problem is some form of rehearsal that was not taken into account.

      When the journal asked if the referee wanted to disclose her/his name the referee declined.  Since this was the PLoS journal, which is unbelievably slow in responding, I decided not to argue my case.

      How short term memory works

      Eugen G Tarnow  August 3 2011 10:35:11 AM
      1.  It takes 0.5 to 3 seconds to put a word in your short term memory by looking at it or listening to it.  This "activation" time scale roughly corresponds to the time scale of synaptic exocytosis, i.e. the time it takes to have all the messenger molecules released from the "presynaptic" side of the synapse.  The activation time is longer for items that are complex than for items that are simple.

      2.  The activation time is longer if the word is not presented to you at the time and there is no obvious path to the memory.  For example, in free recall, in which a list of words is presented and one then tries to remember them, it takes up to 8 seconds to reactivate a word.

      3.  It takes 3 seconds to 15 minutes to lose a short term memory.  It "decays" relatively slowly with time (logarithmically).  The time scale and functional behavior are roughly the same as for synaptic endocytosis, i.e. the time it takes for the presynaptic side of a synapse to heal back up and recreate the containers with the messenger molecules.

      4.  Full activation in short term memory typically extends to three items at a time; this is referred to as "working memory".  You might think of these three reactivation paths as corresponding to the past, present and future - a minimal set that allows us to learn from the past and project into the future.


      Temporal Context Model

      Eugen G Tarnow  March 30 2011 05:26:38 PM
      I got the following request from a referee rejecting my paper (presumably the referee was Marc Howard):

      "The model really needs to be compared against other models of free recall rather than SOB which is a model of serial recall. So, the obvious candidate is the Temporal Context Model (Howard and Kahana). In particular, I can't see how this model is going to be able to handle conditional response curves which are key data for free recall."

      I took a little time to address this issue and wrote:

      Kahana (1996) found that a recalled item is more likely to be followed by the next item in the list than by a previous item in the list.  He constructed “conditional response probability” (CRP) graphs that he claimed could not be reproduced with the then current computer models.
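
For readers who have not seen a CRP graph, the following is a sketch of one common way to compute it. For every transition between two successively recalled items, the lag (difference in list position) actually taken is counted, and so is every lag that was still available at that moment; the CRP at a lag is the ratio of the two counts. The function name, the toy recall sequences and the assumption of clean data (no repeated recalls, no intrusions) are mine, not Kahana's.

```python
from collections import Counter


def lag_crp(recall_sequences, list_length):
    """Conditional response probability as a function of lag.

    recall_sequences: one list per trial, giving the serial positions
    (1-based) of the recalled items in the order they were recalled.
    Assumes no position is recalled twice within a trial.
    """
    actual = Counter()    # transitions that were made, by lag
    possible = Counter()  # transitions that could have been made, by lag
    for recalls in recall_sequences:
        recalled = set()
        for current, following in zip(recalls, recalls[1:]):
            recalled.add(current)
            for candidate in range(1, list_length + 1):
                if candidate not in recalled:
                    possible[candidate - current] += 1
            actual[following - current] += 1
    return {lag: actual[lag] / possible[lag] for lag in sorted(possible)}


# Hypothetical recall orders for three trials with a five-item list.
trials = [[3, 4, 5, 1], [5, 4, 2], [2, 3, 1]]
crp = lag_crp(trials, list_length=5)
for lag, probability in crp.items():
    print(f"lag {lag:+d}: {probability:.2f}")
```

A forward asymmetry of the kind Kahana reported shows up as a larger value at lag +1 than at lag -1.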

      This was followed by the Temporal Context Model (Howard & Kahana, 2001).  It argues that the u-shaped free recall curve is really not u-shaped.  To minimize the primacy effect and turn the u-shape into a j-shape, they add a distracting task at each item presentation: deciding whether a word is concrete or abstract.  The primacy effect is still there for the first list item, and Howard & Kahana (2001) concede that “the model has no mechanism to generate primacy, it fails to capture the small one-position primacy effect in the data.”

      The CRP graphs can be accommodated in the Dynamic Tagging theory as a result of chunking, i.e. the subject associates two items together and makes one memory out of them.  Howard & Kahana (2002), though, write that “it is unlikely that the lag-recency effect is a consequence of direct interitem associations” because there seems to be what they term time scale invariance, i.e. the CRP graphs look the same even if the interitem interval is changed from 2 to 16 seconds.  The evidence for this invariance is limited to Fig. 2 in Howard & Kahana (2002) in which the interitem interval is varied from 2 to 16 seconds.  Unfortunately, the data in the figure have error bars much too large to prove their point.  The data behind the Dynamic Tagging model (Tarnow, 2008) show the slow decay of short term memory on a logarithmic time scale, which makes this point even harder to prove.

      But the main weakness of the Temporal Context Model is that it was created purely to fit the CRP curves using abstract concepts without any regard to the underlying biology.  It has choices, for example, “if one prefers the response–suppression approach, then TCM can be used to generate a set of activations, which can in turn be modulated by response suppression” (Howard & Kahana, 2001).  It can be anything to anyone, which also means it has no predictive power and carries no information.

      Do you agree or disagree?  Feel free to post a comment!

      Response probability and response time: a straight line, the Tagging/Retagging interpretation of short term memory, an operational definition of meaningfulness and short term memory time decay and search time

      Eugen G Tarnow  January 2 2011 09:22:51 PM
      Response probability and response time: a straight line, the Tagging/Retagging interpretation of short term memory, an operational definition of meaningfulness and short term memory time decay and search time
      Cognitive Neurodynamics, 2008, 2(4), 347-353.

      The functional relationship between correct response probability and response time is investigated in data sets from Rubin, Hinton and Wenzel, J Exp Psychol Learn Mem Cogn 25:1161–1176, 1999 and Anderson, J Exp Psychol [Hum Learn] 7:326–343, 1981. The two measures are linearly related through stimulus presentation lags from 0 to 594 s in the former experiment and for repeated learning of words in the latter. The Tagging/Retagging interpretation of short term memory is introduced to explain this linear relationship. At stimulus presentation the words are tagged. This tagging level drops slowly with time. When a probe word is reintroduced the tagging level has to increase for the word to be properly identified leading to a delay in response time. The tagging time is related to the meaningfulness of the words used—the more meaningful the word the longer the tagging time. After stimulus presentation the tagging level drops in a logarithmic fashion to 50% after 10 s and to 20% after 240 s. The incorrect recall and recognition times saturate in the Rubin et al. data set (they are not linear for large time lags), suggesting a limited time to search the short term memory structure: the search time for recall of unusual words is 1.7 s. For recognition of nonsense words the corresponding time is about 0.4 s, similar to the 0.243 s found in Cavanagh (1972).
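
The two percentages quoted in the abstract are enough to pin down a logarithmic curve, which makes it easy to read off values in between. The sketch below assumes that "logarithmic" means level(t) = a - b*ln(t) with t in seconds, and that 50% at 10 s and 20% at 240 s lie exactly on the curve. The coefficients are derived from those two numbers only, not taken from the paper, and the formula should not be used outside that range (it diverges as t approaches 0).

```python
import math


def log_decay_from_two_points(point_1, point_2):
    """Solve level(t) = a - b*ln(t) through two (time, level) points."""
    (t1, level_1), (t2, level_2) = point_1, point_2
    b = (level_1 - level_2) / math.log(t2 / t1)
    a = level_1 + b * math.log(t1)
    return a, b


def tagging_level(t_seconds, a, b):
    """Tagging level (fraction of full tagging) t_seconds after presentation."""
    return a - b * math.log(t_seconds)


# Anchor points quoted in the abstract: 50% after 10 s, 20% after 240 s.
a, b = log_decay_from_two_points((10.0, 0.50), (240.0, 0.20))

for t in (10, 30, 60, 120, 240):
    print(f"{t:>4d} s: {tagging_level(t, a, b):.2f}")
```

On this assumed curve, each factor of 24 in time costs 30 percentage points of tagging level, which is what makes the decay "slow" compared with an exponential.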

      Why The Atkinson-Shiffrin Model Was Wrong From The Beginning

      Eugen G Tarnow  January 2 2011 09:14:12 PM
      Why The Atkinson-Shiffrin Model Was Wrong From The Beginning

      WebmedCentral NEUROLOGY 2010;1(10):WMC001021

      The Atkinson-Shiffrin (1968) model, a standard model of short term memory cited over three thousand times, mimics the characteristic shape of the free recall curves from Murdock (1962). However, I note that it is not a theoretically coherent explanation and that it does not fit any other relationships present in the same Murdock data. As a result, future theorists are challenged with defining the buffer concept properly, with defining the long term store properly, and with correctly predicting new relationships found in the Murdock data that directly probe various theoretical concepts.

      Short term memory may be the depletion of the readily releasable pool of presynaptic neurotransmitter vesicles of a metastable long term memory trace pattern

      Eugen G Tarnow  January 2 2011 09:12:06 PM
      Short term memory may be the depletion of the readily releasable pool of presynaptic neurotransmitter vesicles of a metastable long term memory trace pattern  
      Cognitive Neurodynamics, 2009, 3(3), 263-9.

      The Tagging/Retagging model of short term memory was introduced earlier (Tarnow in Cogn Neurodyn 10 2(4):347–353, 2008) to explain the linear relationship between response time and correct response probability for word recall and recognition: At the initial stimulus presentation words tag the corresponding long term memory locations. The tagging process is linear in time and takes about one second to reach a tagging level of 100%. After stimulus presentation the tagging level decays logarithmically with time to 50% after 14 s and to 20% after 220 s. If a probe word is reintroduced the tagging level has to go back to 100% for the word to be properly identified, which leads to a delay in response time. This delay is proportional to the tagging loss. The tagging level is directly related to the probability of correct word recall and recognition. Evidence presented suggests that the tagging level is the level of depletion of the Readily Releasable Pool (RRP) of neurotransmitter vesicles at presynaptic terminals. The evidence includes the initial linear relationship between tagging level and time as well as the subsequent logarithmic decay of the tagging level. The activation of a short term memory may thus be the depletion of RRP (exocytosis) and short term memory decay may be the ensuing recycling of the neurotransmitter vesicles (endocytosis). The pattern of depleted presynaptic terminals corresponds to the long term memory trace.

      There Is No Limited Capacity Memory Buffer in the Murdock (1962) Free Recall Data

      Eugen G Tarnow  January 2 2011 09:11:10 PM
      There Is No Limited Capacity Memory Buffer in the Murdock (1962) Free Recall Data
      Cognitive Neurodynamics 2010, 4(4), 395.

      Theories of short term memory often include a limited capacity ‘‘buffer’’. Such a buffer contains items which do not decay at all but are overwritten by new data. I show that one of the experiments that fueled the buffer concept, the free recall experiments by Murdock (J Exp Psychol 64(5):482–488, 1962), does not contain such a buffer.

      Short term memory bowing effect is consistent with presentation rate dependent decay

      Eugen G Tarnow  January 2 2011 09:09:13 PM
      Short term memory bowing effect is consistent with presentation rate dependent decay
      Cognitive Neurodynamics 2010, 4(4), 367.

      I reanalyze the free recall data of Murdock, J Exp Psychol 64(5):482–488 (1962) and Murdock and Okada, J Verbal Learn and Verbal Behav 86:263–267 (1970) which show the famous bowing effect in which initial and recent items are recalled better than intermediate items (primacy and recency effects). Recent item recall probabilities follow a logarithmic decay with time of recall consistent with the tagging/retagging theory. The slope of the decay increases with increasing presentation rate. The initial items, with an effectively low presentation rate, decay with the slowest logarithmic slope, explaining the primacy effect. The finding that presentation rate limits the duration of short term memory suggests a basis for memory loss in busy adults, for the importance of slow music practice, for long term memory deficiencies for people with attention deficits who may be artificially increasing the presentation rates of their surroundings. A well-defined, quantitative measure of the primacy effect is introduced.

      Rehearsal - it is there to make everybody scurry around.

      Eugen G Tarnow  January 26 2010 09:48:43 PM
      If one goes to the basement of the New York Hall of Science and studies the visitors who make soap bubbles, one finds that they invariably take the large metal loops and shake them in the water.  In the exhibit there is a sign that tells them not to do it.  I have no doubt that the shaking does nothing to enhance the bubbles, yet we all have a need to do it.

      Similarly, memory psychologists have a need to believe in "rehearsal".  It is a conscious, unconscious process that is always there but can only be studied if the subjects are forced to do it.  Confused?  Me too.

      Researchers argue that rehearsal cannot be turned off and they argue, like Atkinson & Shiffrin, that you believe in it once you see it.  Jonides et al write that "rehearsal most likely reflects a complex strategy rather than a primitive STM process" - they can't make up their minds.  They also write "rehearsal is often implicitly assumed as a component of active maintenance, but formal considerations of STM typically take the opposite view" - sounds like nobody can make up their minds.  This is a mess reminiscent of Atkinson & Shiffrin's failure to properly define rehearsal and their ending up with a theory which is not a theory.

      Researchers like McElree and Jonides and others study rehearsal by forcing the subjects to rehearse.  It seems to me that one might be able to make the statement that rehearsal is assumed to be there, never shown to be there directly other than by forcing the subjects to do it.  Naveh-Benjamin and Jonides quote many studies in the beginning of the paper that disagree on the consequences of maintenance rehearsal - thus even the consequences, let alone the actual process, are "fragile" and in their experiments they force the subjects to rehearse aloud.  In other words, to study rehearsal one has to make it conscious.  If one does not make it conscious one believes it is still there without any direct evidence.  Articulatory suppression in your paper makes the performance degrade but is it ever possible to prove that it is because, and only because, it prevents rehearsal?  

      Then there is the idea that one prevents rehearsal by forcing subjects to repeat some nonsense syllables.  What?  How can one ever show that this is the one and only effect of the nonsense syllables?

      I do believe rehearsal exists for very unusual events or very attractive or horrible events and that this makes for long term storage.

      I would be more inclined to believe in a conscious rehearsal effort if somebody did an experiment that pays the subjects nothing, $1 or $10  or $100 for the average number of items remembered (using a non-attractive, detached experimenter actor), records their facial expressions and then does a post-experimental interview about their memory efforts and those efforts show that a conscious rehearsal makes a difference.

      If nobody can show that people use rehearsal without forcing them to do so, the concept should be relegated to the garbage can.

      I have a terrible memory, btw.  Perhaps it is because I do not rehearse...  

      The Atkinson Shiffrin Model: It is wrong, "everybody knows it" but nobody writes it

      Eugen G Tarnow  December 20 2009 01:44:16 PM
      The Atkinson & Shiffrin paper is cited more than 3000 times.  Yet it is incorrect.  But you would not know that reading the literature.  Attached is my paper that tries to set the record straight:  AtkinsonShiffrinIsWrong.pdf

      Of course, the paper is probably not going to be published because it either (1) hurts people's feelings, (2) is a negative publication, (3) writes something everybody knows or (4) goes against the dogma of the field.  Or could it be because Atkinson was the Director of the National Science Foundation and President of the University of California system?  You pick.  Or the paper could simply be wrong or badly organized.

      Here are the referee reports I got.  What do you think?  Feel free to post your comments. Shiffrin's comment is posted below.  

      Editor:  As part of the initial review process I examined the manuscript to determine if it was appropriate for the journal. Based on this examination, I decided not to send the manuscript out for review because it is not suitable for publication in X.

      I found the manuscript quite interesting and even amusing and think that for the appropriate audience it will be important. However, X is not that audience, as the manuscript is not an example of historical scholarship.

      My decision not to send your manuscript out for review has nothing to do with the quality of your paper. Rather, the decision is based on the fact that where your paper might make its unique contribution would be more appropriate for another journal. Two journals come immediately to mind: Theory and Psychology, and the Journal of Theoretical and Philosophical Psychology. I am sure there are others that are more directly linked to the field of memory studies. As such, I would suggest that you consider submitting the manuscript to a journal that focuses more directly on the topic and content of your paper. I realize this judgment is, and always will be, somewhat subjective, but it falls to the Editor of each journal to make such determinations concerning the fit of the paper and content to the role and mission of each journal and regret any disappointment this decision may create for you.

      Editor: We have reviewed your manuscript, and we are sorry to inform you that it does not fit our journal's main goals of advancing both theoretical and empirical aspects of psychological processes. Your paper provides a short critique of the Atkinson-Shiffrin model, but it does not incorporate more contemporary works of memory, many of which also analyzed this original model and provided new variants to understanding the complex nature of memory processes.

      Editor Y: Your paper has been thoroughly read and discussed by the editors.  Unfortunately, it is our opinion that the work does not reach the level of conceptual advance that would make it a good candidate for consideration as a commentary in Y Journal.  Please note that this decision does not imply any criticism of the work.

      Editor X:  First let me thank you for submitting your manuscript to X. I was planning to invite reviews from two experts in your research area but had a unusually difficult time finding any volunteers for this paper. All I received until today is the review from Richard Shiffrin (R1), which you find below. Given that you are waiting for a while already, I decided to base this action letter on this one review in my own reading - which turned out to fit with the reviewer's impressions.

      The major problem is, in a nutshell, that there is probably no audience for this contribution - at least among the readers of our journal. The issue that is addressed is too specific and technical to make for a useful historical contribution and it is way too outdated to make a relevant scientific contribution to current thinking. Moreover, the fact that no attention is devoted to all the theoretical developments since 1968 makes it impossible to judge the theoretical implications of your comment. It just comes too late one might say.

      This leaves me no choice but rejecting the manuscript. I'm sorry to be the bearer of such bad news, but I hope you can understand my reasoning. In any case, thank you for choosing X as an outlet of your work, I'm looking forward to further work from your lab.

      Reviewer #1, Rich Shiffrin:

      As much as I like to hear that someone actually read the 1968 paper (not to mention enjoying yet another citation), this strikes me as a 'silly' paper. First, it goes on at length about a minor aspect of the details of the buffer process, many aspects of which have been fleshed out in 40+ years of research since 1968. Second it makes the point that the model is 'wrong' when that is irrelevant: All our models are extremely wrong, and are simply crude approximations that that are meant to aid understanding, and lead to further progress. Of course a 1968 model ought to be even 'cruder' than a more recent one. It is also bizarre to see that someone is hoping to publish a paper that ignores entirely the last 40+ years of research and progress in the field. Finally, my quick reading of the submission revealed a number of places where one could argue about the analyses and interpretations, but given the major problems just noted, it is hardly worth going into these.

      Editor #1: I have read your paper "The Atkinson-Shiffrin model is wrong". I agree with the assertion in your cover letter that most researchers in the field know that there are problems in the Atkinson-Shiffrin model (including the authors of the model) even though it often gets cited inappropriately. The paper offers an interesting analysis of the assumptions of the model. Nonetheless, I am afraid the paper is not appropriate for Psychological Review. As you may know Journal X is the primary outlet for new theories and models across the entire field of Psychology and this paper does not offer theoretical news, even if it may have a message for the world more generally. The competition for our pages is intense, and I must focus on those papers that best fit our mission.

      I'm sorry that my news couldn't be better. Indeed, you may feel that this letter is a bit unfriendly. It isn't intended as such, and I would prefer to think that my rapid response would enable you to more quickly seek an appropriate outlet for your work. I certainly wish you well in this endeavor.

      Reviewer #1: The manuscript is devoted to consideration of the Atkinson-Shiffrin model (A&S) of short term memory. The author formulates that the A&S model is based on four postulates which are ill-defined and contradictory. Also, and the author declares that the four postulates are too many for a good theory. After that the author  considers four "simplified models" where one of four postulates is not valid and demonstrates that each simplified model fails to describe experimental data. Therefore, all four postulates are important. However, the conclusion of the manuscript sounds: "I have shown that the A&S theory is problematic in many ways and in particular does not describe . data". This conclusion sounds like an unmotivated statement. It seems that this conclusion is just relevant to the simplified model.

      Manuscript looks like a collection of contradictory statements and criticism of A&S model. It is not clear what is a goal of the manuscript, what is a novelty, what is a message to the scientific community. Also, the manuscript is poorly organised and many statements are not clear formulated. Thus the manuscript does not satisfy a standard of journal publication. I suggest to reject the manuscript.

      Reviewer #2: In his manuscript, Tarnow provides a critique of Atkinson - Shiffrin theory of memory. In particular, the author cites the vagueness of definition of some concepts, and the lack of a good fit to Murdock's (1962) free recall data, as primary arguments for the critique. However, the manuscript is confused and confusing, and it falls far short of providing even incremental advancement to our existing knowledge.

      The manuscript fails (rather conspicuously) to distinguish between a conceptual framework (Atkinson - Shiffrin theory) proposed to explain a particular phenomenon (free recall) and a quantitative, predictive computational model of the same phenomenon. First, as a mechanistic critique, the manuscript cites that the theory requires "four concepts to fit two parts of a curve" and that the "curves cannot be fit without all these four concepts." Tarnow asserts that "a true test of a theory is whether after fitting it to a data set, it can then predict correctly other relationships of the data." However, this criticism is irrelevant to a conceptual framework but more applicable to a predictive model. (Parenthetically, it appears from the Introduction that the author is more critical of the writing style of Atkinson and Shiffrin than the actual theory.) It is not uncommon, in conceptual theories of behavioral data, that various behavioral phenomena are described/explained using abstract concepts (such as "rehearsal buffer"). However, it should be realized that these concepts are not meant to be solid, well-defined variables in a parametric model or to correspond to actual physiological substrates. They rather serve as a useful starting point for development of more detailed, accurate theories or models. (In fact, the appeal of Atkinson - Shiffrin theory as a simple conceptual starting point, not its formal applicability, is the reason for 3000+ citations.)  Therefore, the author's attempt (and his motive) to criticize Atkinson - Shiffrin theory is confused and confusing.

      Second, it is obvious that all concepts, by design, are required to explain primacy and recency effects; to show that the theory cannot explain the data without one of these concepts makes no sense. In other words, the theory is designed to explain a particular phenomenon, and rendering it incomplete by leaving out an integral part of it will obviously lead to its failure. A more sensible critique would be to question if and how the fundamental concepts of a theory correspond to observable substrates. In fact, all four concepts of Atkinson - Shiffrin theory have been questioned, challenged, and/or revised over the last 4 decades. Numerous other researchers have raised similar questions and critiques as Tarnow did in his manuscript (e.g. "why should we start with an empty rehearsal buffer?", or, whether the Atkinson - Shiffrin model can explain other aspects of Murdock's free recall data; see, for example, various studies/models of memory by Baddeley's group and Grossberg's group, among others). Nevertheless, while providing criticisms, previous studies also provided quantitative or qualitative revisions or alternatives to the theory to explain further data that cannot be explained/predicted by the original Atkinson - Shiffrin formulation. The current manuscript, however, only points out the phenomena that the theory cannot explain in its original form (which is not novel anyway), and does not provide any alternatives. Therefore, the current manuscript does not offer any insights into short-term memory.

      Reviewer #3: First, the paper is not in APA format. The figures are not placed at the end, sometimes with a strange number (Figure X), axis labels are not displayed correctly (see Fig 2) and the formatting of the figures is not according to the usual standards.

      Second, the text is often unclear and sometimes not grammatical (see p. 2, "I myself is a doubter").

      Third, the logic is often incorrect. See e.g. p. 2/3 and p. 13. Why should there be no output interference effect when the last presented item is the item recalled first? After all, from the model's point of view, at the start of the recall all 4 items in the buffer are equivalent.

      Fourth, the analyses and the conclusions drawn are often incorrect. For example, the comparison in Table X is clearly biased due to taking the lowest recall probability in the data (hence capitalizing on random error).

      Here are some more referee reports on the next version of this paper:

      Reviewer #1: This article presents challenges to the Modal Model, principally targeting the Atkinson & Shiffrin (1968) paper.

      The main weakness of this article is that it is unbalanced and disconnected from the relevant existing literature. For this reason, its value is unclear in that it is redundant with some pre-existing criticism and it challenges an already weak model. Models that succeed (e.g., by bypassing the entire notion of dual memory stores and buffers, or by specifying what these constructs mean) are not mentioned, which is puzzling. The strengths of the model are not mentioned either, which is equally puzzling.

      Page 3, first paragraph: I do not see the contradiction. Can't one control something one can't observe?

      Page 4, strange criticism of random selection of items to drop out of the buffer. Why should the randomness originate from random fluctuations in firing rate? Wouldn't this kind of (approximately) random process be more related to fluctuations in stimulus properties, context, etc.?

      Page 4, "Biochemical evidence..." - how is this relevant? LTP has nothing to do with the psychologist's notion of "long-term memory", including the Modal Model. This is simply a terminological mismatch. Again it would help to pay attention to what other people have written. Interestingly, LTP does not seem to correspond to any relevant time scale in observed behavior, so it might be a red herring anyway (as many people have suggested).

      Large number of parameters needed to fit the SPC - Great, so see the many ratio-rule models including SIMPLE.

      Page 5, "working memory" is mentioned but never defined. This is a very controversial construct, so it should be treated with care and precision.

      The modeling exercise is interesting and might be instructive, but it is unclear to me whether or not this is novel. I'm sure many people have done this before and very likely published these kinds of simulated exercises.

      Page 12, "intermediate item" - use the conventional term, "asymptote". Which item are you using?

      The model is not fully specified.

      It is unclear why Cowan (2000) of all papers is used to set the buffer size.

      Reviewer #2:
      Page 2, para 1, line 4: no need to point out that Izawa was a former graduate student in same institution as Atkinson & Shiffrin.
      Page 2, para 1, line 6: use of the word "apparently" raises doubt as to the reliability of the statement.
      Page 2, para 3, lines 1-2: Informal style for this type of publication.
      Page 3, para 1, line 1: "this theory has to be well defined" replace "has" with "have".
      Page 3, para 1, line 1: suggest replacing "problem" with "issue".
      Page 3, para 2 and 3 - (for PB) check that his interpretations of the A&S definitions are justified.
      Page 3, para 4, attempting to advise the reader on the attributes of a theory could be considered patronising, suggest rephrasing.
      Page 3, para 4, line 7: "She must have misunderstood...used definition " seems sarcastic and inappropriate for this publication.
      Page 3, para 5: this para suggests that we must always have a partially full RB. However, this need not be the case, especially in the case of a verbal memory task, in which participants are given time before beginning the experiment without any verbal noise.
      Page 4, para 3, line 7: "introduces" should be "introduce".
      Page 4, para 3, line 11: "is" should be "are".
      Page 4, para 3, lines 10-11: Operational definitions clearly are helpful in defining theories and the literature indicates that many publications use STM and others STS, whereby STM is the process and STS is perhaps the venue. It's not clear why the author objects to the A & S model using this STS and STM separation.
      Page 4, para 4, line 2: "Bending concepts into a pretzel as is done by A & S", this seems a very sarcastic manner of communicating the author's views to readers and is not helpful.
      Page 4, para 4, line 4: "predict" should be "predicts".
      Page 5, para 2, line 4: "ingredient" should be "ingredients".
      Page 5, para 3, line 3: if the presentation rates are 0.5 and 1 second, but the number of seconds allowed to pass between list items is 2 (line 1 of same para), then there are 1.5 and 1 seconds not accounted for in your explanation of the Murdock methodology. If these missing 1.5 and 1 seconds can be accommodated by the ISI, this should be made clear.
      Page 7, para 1, line 1: some text missing before "LTS".
      Page 7, para 1, line 4: it's unclear how a first in first out algorithm would generate probability recall rates of 50% for items 1-6 and 100% for words 7-10. This needs more specification.
      Page 7, para 1, line 5: it's not clear why starting with a non-empty buffer would generate the model data in figure 1D.
      Pages 8-10: graphs need legend specification, beyond the first graph.
      Page 12, para 2, lines 1-3. This text refers to table 1. Where are the starting numbers to apply your definition of primacy strength? It's not clear how the values in the table are generated.
      Page 14, para 1, lines 1-9: the example is specific to list positions 7 & 8; the effect at middle serial positions is likely to be quite different.
      Page 18, para 1, lines 1-3, more explanation needed to accompany table 3.
      Page 23, para 1, lines 1-2: use of the phrase "it is a mystery" could be considered inappropriate.
      Page 23, para 2, line 5: "The latter include the failure" - the latter what?  

      General: This paper focuses on trying to disprove the Atkinson and Shiffrin model, but does little to consider other theoretical models that have been published since the seminal Atkinson and Shiffrin paper. There is little or no mention of the Working Memory model (Baddeley & Hitch, 1974), the ACT-R model (Anderson & Lebiere, 1998), the SIMPLE model (Brown, Neath & Chater, 2007), the primacy model (Page & Norris, 1999), Positional Distinctiveness model (Nairne et al, 1997) etc. The current literature on FR is not fully explained by the A & S model, and consequently more recent models have been suggested, adapted and cast aside in light of further research. It is not clear to me why the author is attempting to disprove a model which many other researchers feel has been outdated for many years.
      In addition, the writing style of this paper appears somewhat sarcastic throughout. The author of this paper typically uses the first person in this paper, when it is common practice to write articles in the third person. This article is also written in an informal and perhaps more conversational style inappropriate for journal articles.

      Fifth, the phrasing and the tone of the arguments are inappropriate for a journal such as X.

      All in all, the problems are such that I have to reject your paper for publication in X.  

      Editor #2: As you might have seen, X is a fine place to publish controversial articles. But, as you wrote, it is true -- we all know that the Atkinson-Shiffrin model is wrong. And we all know that math models aren't theories. The A&S model was a step in a progression. And math models are one of many tools in our toolbox. So, I will not send this article out for review. But thank you for considering X for the publication of your research.